The Federal Aviation Administration (FAA) is planning to replace within
the next ten years the computers used to provide en route air traffic control
services; in carrying out this replacement there are many different strategies
the FAA could follow. The purpose of this report is to study the strategy
known as rehosting the National Airspace System (NAS) software on
instruction-compatible machines. The idea is that the current computers (and
associated peripherals) would be replaced by modern hardware that executes the
same machine-language instructions. The current NAS software would be changed
only insofar as proves necessary for the software to run on the new machines;
these changes to the software are expected to be minor.

The  rehosting  strategy  is  evaluated  in  seven  areas.  First,  how  reliable 
is the system? Second, how well will the system perform under expected
workloads? Third, how serious are the technical obstacles to adapting the
software to run on the new machines? Fourth, what would the new system
cost? Fifth, what problems would be encountered during the transition to
the new system? Sixth, how quickly could the system be procured? Seventh,
how well adapted is the system to future growth?

The  conclusion  is  that  the  rehost  strategy  is  technically  feasible, 
but  there  is  some  uncertainty  about  what  this  strategy  would  cost  and  how 

The Federal Aviation Administration (FAA) is planning to replace by 1990
the computer systems used to provide en route air traffic control services.
If, however, air traffic grows so that the demand on these computers exceeds
their capacity before they are replaced, then it will be necessary to either
restrict  air  traffic  or  to  adopt  some  interim  system  designed  to  stretch  the 
life  of  the  current  system  by  a  few  years.  This  report  is  one  in  a  series 
that  examines  the  advantages  and  disadvantages  of  the  potential  interim 
systems  in  order  to  provide  the  information  needed  by  FAA  decision-makers. 

The  purpose  of  this  report  is  to  discuss  the  interim  system  achieved  by 
rehosting  the  National  Airspace  System  (NAS)  software  on 
instruction-compatible machines. That is, the current computers,
collectively referred to as IBM 9020's, at each air route traffic control
center (ARTCC) would be replaced by modern machines that execute nearly the
same instruction set. This rehost system would use the NAS software
currently used, with this software changed only insofar as necessary for it
to run on the new machines. The advantage sometimes claimed for this system
is that it would allow the FAA to increase the capacity and reliability of
the en route computer systems while avoiding the expense and risk of
completely  new  software. 

The rehost system would replace the current computers (including the
display channel computers), tapes, and disks. The peripheral adapter
modules and the controller suites up to and including the display generators
would not be replaced. The heart of the rehost system would be two
mainframes. One mainframe would handle the processing now done by the
central computer complex (CCC) and by the display channel; the other
mainframe would be standing by ready to take over the processing if the
first mainframe fails. The back-up mainframe would be able to carry out
ancillary processing tasks (e.g., analyzing performance data, providing
training simulations) while standing by.

For concreteness, this report analyzes one specific rehost system.
Several variants have been suggested, but these are not considered in detail
since it is expected that they would not significantly alter the analysis.

The  information  about  the  rehost  system  that  is  relevant  to  the  FAA's 
decision  on  whether  this  system  should  be  procured  is  presented  under  the 
headings  of  cost,  schedule,  reliability,  performance,  technical  issues, 
transition,  and  growth  potential. 

Cost. Using the current cost of providing en route air traffic control
services as a baseline, the change in this cost that would result from
rehosting is estimated. Seven categories of cost are considered. First,
the cost of developing and initially testing the software is estimated to be
$5.8 million. This cost covers the needed modifications to the on-line
software, the support software, and the virtual machine monitor. Second,
the cost of acquiring the hardware for 23 sites is estimated to be either
$123.8 million if Amdahl 470/V7's are purchased or $175.1 million if IBM
3033D's are purchased. Since the V7 and the 3033D are judged to be the two
mainframes that are best suited to rehosting, the cost calculation is
carried out for both. These figures include the cost of mainframes, tape
units, disk units, other peripherals, and the special hardware that is
needed. Third, the cost of testing the complete system at the FAA Technical
Center and at the first en route center is estimated to be $6.3 million.
This cost covers the testing that is necessary to bring the system to the
point where it is ready to be routinely deployed. Fourth, the cost incurred
during the transition period is estimated to be $36.9 million. This figure
covers the cost of remodeling the centers, the cost of the extra personnel
needed during transition, and the cost of developing courses on the new
system and teaching them to FAA personnel. Fifth, the initial cost of spare
parts and documentation is expected to be either $26.7 million if V7's are
purchased or $37.8 million if 3033D's are purchased. Sixth, because of the
greater reliability of the rehost system, there will be a saving in the cost
of spare parts and maintenance personnel. After an initial shakedown
period, the rehost system would save an estimated $9.3 million per year.
Seventh, the FAA administrative cost over the six-year program is estimated
to be $41.2 million. This covers program planning, management, and review.

Table ES-1 summarizes all of the front-end costs that are incurred to
get the rehost system operational at all sites. This table shows each
category of cost and indicates where it would be incurred; HQ, TC, and AC
stand for FAA headquarters, the FAA Technical Center, and the FAA
Aeronautical Center, respectively. The initial cost that would be incurred
if rehosting were adopted is estimated to be either $241.0 million if V7's
are procured or $303.4 million if 3033D's are procured. All costs are in
1981 dollars. Table ES-2 shows how the annual saving in the maintenance
cost would begin at about a half million dollars in the first year in which
a system is installed and would gradually rise to $9.3 million per year.
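
The two totals quoted above can be reproduced by adding up the category
figures given in the text and in Table ES-1. The following is a minimal
Python sketch of that roll-up; the dictionary keys and function names are
labels chosen here for illustration, and the amounts are the report's own
estimates in millions of 1981 dollars.

```python
# Front-end cost categories (millions of 1981 dollars) quoted in the report.
# Categories that do not depend on the choice of mainframe:
COMMON = {
    "software": 5.8,
    "hardware engineering": 0.3,
    "testing": 6.3,
    "transition": 36.9,
    "program management and support": 41.2,
}
# Categories that differ between the Amdahl V7 and the IBM 3033 options:
BY_MAINFRAME = {
    "hardware acquisition": {"V7": 123.8, "3033": 175.1},
    "spare parts and documentation": {"V7": 26.7, "3033": 37.8},
}


def initial_cost(mainframe: str) -> float:
    """Sum every front-end category for the chosen mainframe option."""
    total = sum(COMMON.values())
    total += sum(costs[mainframe] for costs in BY_MAINFRAME.values())
    return round(total, 1)


def acquisition_share(mainframe: str) -> float:
    """Fraction of the initial cost that is hardware acquisition."""
    return BY_MAINFRAME["hardware acquisition"][mainframe] / initial_cost(mainframe)


for option in ("V7", "3033"):
    print(option, initial_cost(option), round(acquisition_share(option), 2))
```

Under these figures the hardware acquisition share comes out slightly above
one half for either mainframe.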

The main cost of rehosting is seen to be the hardware acquisition cost,
which is more than half the initial cost. Since most of the hardware
acquired is off-the-shelf equipment, there is relatively little uncertainty
about this cost. Because of the uniqueness of the rehost problem, there is
considerable uncertainty about the administrative costs, the software cost,
the spare parts cost (which might well be overestimated), and the transition
cost (which might well be underestimated).

These  cost  estimates  assume  that  there  is  replacement  at  all  20 
ARTCC's.  It  is  possible,  however,  that  the  rehost  system  would  be  installed 
only  at  those  centers  that  faced  an  imminent  capacity  problem.  This  partial 
replacement  would  have  the  advantage  of  cutting  down  the  cost  considerably; 
this saving largely results from avoiding the hardware acquisition cost,
which is the main cost, at those centers at which there is no capacity
problem. It is estimated that if V7's are procured, then the initial cost
of rehosting would be $107.0 million if there were replacement at five
centers and $151.7 million if there were replacement at ten centers,
compared to the cost of $241.0 million if there were replacement at all
twenty  centers.  A  disadvantage  of  partial  replacement  is  that  support  would 
be  complicated  since  two  entirely  different  systems  would  be  in  the  field. 
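
The three V7 figures for five, ten, and twenty centers lie very nearly on a
straight line. One way to read them (an interpretation offered here, not a
statement from the report) is as a fixed program cost plus a roughly
constant cost per center. The sketch below fits that line by ordinary least
squares; `fit_line` and the variable names are illustrative.

```python
# (centers replaced, initial cost in millions of 1981 dollars), V7 option.
POINTS = [(5, 107.0), (10, 151.7), (20, 241.0)]


def fit_line(points):
    """Least-squares fit of cost = fixed + per_center * centers."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    per_center = sxy / sxx
    fixed = mean_y - per_center * mean_x
    return fixed, per_center


fixed, per_center = fit_line(POINTS)
# Largest gap between the fitted line and the three quoted estimates.
worst = max(abs(fixed + per_center * x - y) for x, y in POINTS)
print(round(fixed, 1), round(per_center, 2), round(worst, 3))
```

On this reading, roughly $62 million is incurred regardless of how many
centers are converted, and each additional center adds a little under
$9 million.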

Schedule. Once the FAA decided to rehost and issued a request for
proposals (RFP), the steps in the procurement and the estimated length of
each step would be:


TABLE ES-1: INITIAL COSTS INCURRED BY REHOSTING (millions of dollars)

                                    Site (HQ, TC, AC, ARTCC's)      Total

Software                              5.8                             5.8
Hardware
  Engineering                         0.3                             0.3
  Acquisition
    V7                               10.8     5.4   107.6           123.8
    3033D                            15.2     7.6   152.3           175.1
Testing                               3.8     2.5                     6.3
Maintenance
  Initial cost
    V7                                0.9    25.8                    26.7
    3033D                             1.2    36.6                    37.8
Transition cost
  Remodeling                          2.0     1.0    20.0            23.0
  Extra personnel                     4.0                             4.0
  Developing courses                  1.8                             1.8
  Teaching courses                    0.6     0.1     7.4             8.1
Program management and support       41.2                            41.2

Total
  V7                                                                241.0
  3033D                                                             303.4




TABLE ES-2: ANNUAL MAINTENANCE COST SAVING PROVIDED BY REHOSTING

Year             Saving (millions)

1                $0.506
2                 2.387
3                 5.135
4                 7.883
5 and after       9.257


• industry prepares proposals (3 months);
• FAA evaluates the proposals and awards a contract (6 months);
• contractor develops hardware and software (21 months);
• FAA and contractor test a system at the FAA Technical Center (9 months);
• FAA and contractor test a system at the first field site (6 months);
• contractor installs systems at the remaining sites (24 months).

Therefore, from the time that an RFP is issued to the time that the system
is operational at the first field site, there is an elapsed time of 45
months (3 years, 9 months). From the time an RFP is issued until the system
is operational at all sites, 69 months (5 years, 9 months) elapse. This
means  that  if  an  RFP  is  issued  on  1  July  1982,  the  first  system  will  be 
operational  at  an  en  route  center  on  1  April  1986,  and  the  system  will  be 
operational  at  all  centers  on  1  April  1988. 
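
The schedule arithmetic can be checked mechanically. The sketch below adds
up the step durations listed for the procurement and shifts the assumed RFP
date of 1 July 1982 forward by the result. The helper `add_months` is
written here for illustration and handles only dates whose day of the month
exists in the target month, which is all this example needs.

```python
from datetime import date

# Procurement steps and their estimated lengths in months, as listed.
STEPS = [
    ("industry prepares proposals", 3),
    ("FAA evaluates proposals and awards a contract", 6),
    ("contractor develops hardware and software", 21),
    ("test at the FAA Technical Center", 9),
    ("test at the first field site", 6),
    ("install at the remaining sites", 24),
]


def add_months(start: date, months: int) -> date:
    """Shift a date forward by a whole number of calendar months."""
    index = start.year * 12 + (start.month - 1) + months
    return date(index // 12, index % 12 + 1, start.day)


# Everything except the final installation step precedes first-site operation.
months_to_first_site = sum(length for _, length in STEPS[:-1])
months_to_all_sites = sum(length for _, length in STEPS)

rfp_issued = date(1982, 7, 1)
first_site = add_months(rfp_issued, months_to_first_site)
all_sites = add_months(rfp_issued, months_to_all_sites)
print(months_to_first_site, first_site, months_to_all_sites, all_sites)
```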

One  suggested  rehosting  approach  differs  from  the  approach  considered 
here  by  retaining  the  disk  and  tape  drives,  by  making  fewer  software 
changes,  and  by  incurring  a  greater  processing  overhead.  This  approach 
would  reduce  the  development  time  by  an  estimated  12  to  15  months  and  would 
reduce the cost by an estimated $25 million; there would, however, be
greater  uncertainty  over  the  schedule  and  cost. 

It has also been suggested that the mainframes should be leased rather
than purchased to shorten the procurement cycle and to save money; this
approach has three problems. First, since developing the system rather than
acquiring the mainframes is the bottleneck in the procurement, leasing would
not speed the procurement. Second, since three years is typically the
break-even point for a lease and since these computers would probably be in
place for more than three years, leasing would probably end up costing more
rather than less. Third, the user typically is not allowed to maintain
leased computers, and this would interfere with the FAA's providing the type
of maintenance required by air traffic control. (It should be remarked,
however, that because of changes in technology the FAA would probably do
less of the maintenance for the rehost system; for example, the FAA might
use the manufacturer's remote diagnostic services.)

Reliability.  The  FAA's  goal  is  to  have  a  system  with  extremely  high 
availability,  i.e.,  a  system  that  supports  air  traffic  control  with  minimal 
interruptions  in  service.  The  types  of  failures  that  can  beset  the  system 
are  hardware,  software,  and  personnel  failures. 

In  discussing  hardware  failures  it  is  essential  to  distinguish  between  a 
component failure and a system failure. For example, a 9020D has three
compute  elements  (CE's),  two  of  which  are  active  and  one  of  which  is 
redundant.  If  one  of  the  active  CE's  fails,  the  system  is  automatically 
reconfigured  so  that  the  redundant  CE  is  made  active;  with  two  active  CE's, 
there  is  no  system  failure  even  though  a  component  has  failed.  A  system 
failure  occurs  only  if,  before  the  CE  that  failed  is  repaired  or  replaced, 
one  of  the  other  CE's  fails.  Therefore,  because  of  the  redundancy  built 
into  the  system,  a  single  component  failure  does  not  cause  a  system  failure; 
a  system  failure  only  results  when  the  number  of  failed  components  exceeds 
the number of redundant components. Redundancy, then, can lessen but not
completely  eliminate  system  failures.  The  rehost  system  would  decrease  the 
time  it  takes  to  reconfigure  the  system  when  a  component  fails;  the  rehost 
system  would  also  decrease  the  frequency  of  system  failures.  These 
improvements  would  reduce  the  uncertainty  that  controllers  now  have  about 
the  ability  of  the  system  to  quickly  and  completely  recover  from  a  failure. 

The relative availability of the 9020 and rehost systems is studied by
combining information about the redundancy built into each system with
information about the mean time between failures (MTBF) and the mean repair
time of each component. A system failure occurs in the 9020D if at least
one of the following conditions is violated:

• at least 2 of the 3 compute elements are working;
• at least 5 of the 6 storage elements are working;
• at least 2 of the 3 input/output control elements are working;
• at least 2 of the 3 tape control units are working;
• at least 2 of the 3 disk control units are working.

In the rehost system a mainframe is said to contain a CPU, a memory, and 12
channels divided into 6 pairs. A mainframe is working if the CPU is
working, if the memory is working, and if at least one channel in each pair
is working. A system failure occurs in the rehost system if at least one of
the following conditions is violated:

• at least 1 of the 2 mainframes is working;
• at least 1 of the 2 tape control units is working;
• at least 1 of the 2 disk control units is working.

Once assumptions are made about the MTBF's of each component and the
mean time to repair, the system availability and MTBF are calculated; these
results are shown in Table ES-3 for the rehost system and for a system with
a 9020D in the CCC and a 9020E in the display channel. The mean time
between system failures is estimated to be 2905 days for the rehost hardware
and 1226 days for the 9020D/9020E hardware. This greater reliability of the
rehost hardware stems largely from its being configured in parallel; that
is, if either mainframe fails, the system does not fail since the remaining
mainframe can carry the entire load. The 9020D/9020E system, in contrast,
is configured in series; that is, the system operates only if both the 9020D
and the 9020E operate. There is a good deal of uncertainty about the
accuracy of these estimates because the data available for determining the
component MTBF's and repair times were sketchy. Therefore, a sensitivity
analysis was carried out, and it was found that the rehost hardware retained

TABLE ES-3: ESTIMATED AVAILABILITY AND MTBF OF SYSTEM HARDWARE

System           Availability     MTBF (days)

9020D/9020E      0.99998301       1226
Rehost           0.99999283       2905


its reliability advantage for alternate values of the component MTBF's and
repair times. In sum, while the absolute numbers might be questioned, the
conclusion is that the rehost system does exhibit greater hardware
reliability because the results are so lopsided in favor of the rehost
system and because of the persistence of this finding throughout the
sensitivity analysis.
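
The redundancy conditions listed for the 9020D and for the rehost system
translate directly into a k-of-n availability model. The sketch below is one
standard way to set up such a calculation and is not necessarily the method
used in the report. It assumes that components fail and are repaired
independently, that each component's steady-state availability is
MTBF / (MTBF + mean repair time), and that the 9020E display channel is left
out. The component availabilities passed in are hypothetical inputs, not the
report's data.

```python
from math import comb


def steady_state_availability(mtbf: float, mttr: float) -> float:
    """Fraction of time a repairable component is up (same units for both)."""
    return mtbf / (mtbf + mttr)


def k_of_n(k: int, n: int, a: float) -> float:
    """Probability that at least k of n identical, independent units are up."""
    return sum(comb(n, i) * a**i * (1 - a) ** (n - i) for i in range(k, n + 1))


def hardware_9020d(a: dict) -> float:
    """All five redundancy conditions must hold, so the group terms multiply."""
    return (
        k_of_n(2, 3, a["compute"])
        * k_of_n(5, 6, a["storage"])
        * k_of_n(2, 3, a["io_control"])
        * k_of_n(2, 3, a["tape_control"])
        * k_of_n(2, 3, a["disk_control"])
    )


def hardware_rehost(a: dict) -> float:
    """Two mainframes in parallel, plus duplicated tape and disk control units."""
    # A mainframe needs its CPU, its memory, and one channel in each of 6 pairs.
    mainframe = a["cpu"] * a["memory"] * k_of_n(1, 2, a["channel"]) ** 6
    return (
        k_of_n(1, 2, mainframe)
        * k_of_n(1, 2, a["tape_control"])
        * k_of_n(1, 2, a["disk_control"])
    )


# Hypothetical example: every component has MTBF 999 hours and repair time 1 hour.
unit = steady_state_availability(999.0, 1.0)
old = hardware_9020d(dict.fromkeys(
    ["compute", "storage", "io_control", "tape_control", "disk_control"], unit))
new = hardware_rehost(dict.fromkeys(
    ["cpu", "memory", "channel", "tape_control", "disk_control"], unit))
print(old, new)
```

Rerunning the two functions over a range of assumed component values is a
simple way to carry out a sensitivity analysis of the kind described.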

Now  turn  to  the  topic  of  software  reliability.  The  rehost  system  has 
three major software components: the NAS application software, the NAS
monitor, and the virtual machine monitor. During the testing phase it is
expected that new problems would arise with the NAS monitor and application
code, but by the time the rehost system is operational it is expected that
these  two  components  will  return  to  their  present  level  of  reliability.  In 
fact,  because  the  NAS  software  will  all  be  memory-resident,  the  swapping  of 
code  in  and  out  of  main  memory  will  be  eliminated.  Because  swapping  and 
table  size  limitations  are  a  significant  source  of  software  failures,  the 
NAS  software  can  be  expected  to  be  more  reliable  under  rehosting.  The 
virtual  machine  monitor  would  be  a  source  of  new  software  failures. 
Therefore,  under  rehosting  there  will  be  a  decrease  in  failures  because 
swapping  is  eliminated  and  an  increase  in  failures  because  of  the  virtual 
machine  monitor.  It  is  impossible  to  quantify  the  net  effect  on  software 
reliability,  but  it  can  be  concluded  that,  at  worst,  the  rehost  system  will 
have  only  slightly  lower  software  reliability. 

In  order  to  estimate  the  overall  availability  and  MTBF  of  the  system 
where  both  hardware  and  software  failures  are  considered,  the  assumptions 
made  about  the  hardware  are  supplemented  with  tentative  assumptions  about 
the frequency and duration of software failures. Table ES-4 shows the

TABLE ES-4: ESTIMATED AVAILABILITY AND MTBF OF SYSTEM HARDWARE AND SOFTWARE

System           Availability     MTBF (days)

9020D/9020E      0.99998191        613
Rehost           0.99998922       1420

estimated availability and MTBF for each system. It is seen that the rehost
system retains its edge in reliability even when software failures are taken
into account with an MTBF of 1420 days vs. an MTBF of 613 days for the
9020D/9020E system. (It should be emphasized that there are some failures,
e.g., failures caused by human error, that are not captured by this
analysis. Therefore, the numbers in Tables ES-3 and ES-4 should be
interpreted only as relative indicators of the reliability of the two
systems. In other words, this analysis does not allow one to say in
absolute terms what the system reliability is, but it does give a common
basis for comparing the two systems. FAA data indicates that there are
perhaps two or three system failures at each en route center per month.)

In summary, if the current system were replaced by the rehost system,
there would be both a quantitative and a qualitative change in the
reliability. Quantitatively, there would be a reduction in the number of
system failures. Qualitatively, the shorter recovery times and greater
predictability of the rehost system would mean that controllers would have
less uncertainty about how long an interruption in service would last; this
would decrease the disruption caused by short outages.

Performance. The rehost system is only of interest to the FAA if it is
able to adequately handle the workload over its expected life. In order to
study the question of whether the rehost system would perform adequately,
this report focuses on the system response time. The idea is that the
system is constantly receiving inputs such as radar data and messages from
the controllers; the system's main job is to make sure that these inputs are
reflected on the controller's screen in a timely manner. If performance is
inadequate, then the system will fall behind the stream of inputs and the
controller's screen will become out of date. Therefore, the system response
time, or the time it takes an input to be reflected on the controller's
screen, is the relevant way to measure performance.

Two different analyses of performance are carried out. The first is a
rough global analysis which examines the degree to which modern technology
surpasses that embodied in the 9020's and which concludes conservatively
that the rehost system as a whole will have a service time half that of the
current system. From a calculation based on queueing delays it is concluded
that even if the workload were to double, the rehost system would still have
a response time half that of the current response time.
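
The queueing calculation itself is not reproduced in this summary. One
simple model that yields the stated result is the single-server relation in
which response time equals service time divided by one minus utilization,
with utilization equal to arrival rate times service time. Under that
assumption, halving the service time while doubling the arrival rate leaves
utilization unchanged and so halves the response time. The numbers in the
sketch below are hypothetical.

```python
def response_time(service_time: float, arrival_rate: float) -> float:
    """Single-server response time: service time / (1 - utilization)."""
    utilization = arrival_rate * service_time
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must lie in [0, 1)")
    return service_time / (1.0 - utilization)


# Hypothetical current system: 1.0 s service time, 0.6 transactions per second.
current = response_time(service_time=1.0, arrival_rate=0.6)
# Rehost under the same model: half the service time, double the workload.
rehost = response_time(service_time=0.5, arrival_rate=1.2)
print(current, rehost, rehost / current)
```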

The second analysis uses simple operational analysis techniques to infer
system response time from the service time and utilization of individual
components. This analysis proceeds in seven steps. First, characterize the
typical transaction (which is a request for service such as a controller
asking for information) in terms of the workload it imposes on each
component. Second, assume an arrival rate for the transactions. Third,
infer the percentage utilization of each component such as CPU, disk, and
channel. Fourth, determine the response time for each component, using the
equation that the response time equals the service time divided by one minus
the utilization. (Service time is the time it would take to process a
transaction if there were no congestion.) Fifth, add the response times for
the CPU, disks, and channels to obtain what is termed the active server
time. Sixth, determine the delays due to data base locks and non-reentrant
program element locks; this is called the passive server time. Seventh, add
the active and passive server times to obtain the total system response time.
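
The seven steps can be sketched in a few lines of Python. This is an
illustration under stated assumptions rather than the report's model: the
per-transaction service times, the arrival rate, and the passive server time
are hypothetical inputs, utilization is taken to be arrival rate times
service time, and the passive (lock) delay is supplied directly because the
summary does not say how it is derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    service_ms: float  # step one: time per transaction with no congestion


def component_response_ms(component: Component, arrivals_per_s: float) -> float:
    """Steps three and four for a single component."""
    utilization = arrivals_per_s * component.service_ms / 1000.0
    if utilization >= 1.0:
        raise ValueError(f"{component.name} is saturated")
    return component.service_ms / (1.0 - utilization)


def total_response_ms(components, arrivals_per_s: float, passive_ms: float) -> float:
    """Steps five to seven: active server time plus passive server time."""
    active_ms = sum(component_response_ms(c, arrivals_per_s) for c in components)
    return active_ms + passive_ms


# Hypothetical workload: 5 transactions per second (step two).
WORKLOAD = [Component("CPU", 40.0), Component("disk", 60.0)]
total = total_response_ms(WORKLOAD, arrivals_per_s=5.0, passive_ms=10.0)
print(round(total, 1))
```

Because each component's response time grows as its utilization approaches
one, the component with the highest utilization dominates the total as the
arrival rate rises.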

The results are shown in Table ES-5. This table shows for a variety of
track counts the total system response time. The first two columns show
that with a track count of 110 the 9020A and the rehost system have
estimated system response times of 6,735 and 128 milliseconds,
respectively. The rest of the table shows that the rehost system maintains
an acceptable response time for the peak track counts projected through
1995. Throughout the analysis the assumptions adopted are conservative and
chosen to make sure that the response time of the rehost system is not


TABLE ES-5: PERFORMANCE OF THE 9020A AND REHOST SYSTEMS

                                                  Rehost System
                                 9020A   Base   1980   1985   1990   1995

Track count                        110    110    319    384    486    597
CPU utilization                    .73    .15    .44    .52    .66    .81
CPU response time (ms)           2,263     49     74     87    123    224
Disk utilization                   .38    .12    .35    .42    .53    .65
Disk response time (ms)            340     75    102    114    141    190
Channel utilization                .43      -      -      -      -      -
Channel response time (ms)          91      -      -      -      -      -
Total active server time (ms)    2,694    124    175    201    264    414
PE utilization                     .60   .028    .11    .16    .26    .51
Overall response time (ms)       6,735    128    197    239    357    845

N.B. This table assumes that an IBM 3033D or an Amdahl V7 is used as the
rehosting mainframe.
underestimated. Therefore, though the analysis is tentative, it does
strongly suggest that the rehost system can perform more than adequately.
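
The entries of Table ES-5 can be cross-checked against the steps of the
operational analysis. The sketch below confirms that the CPU, disk, and
channel response times add up to the total active server time in each
column, and backs out the passive server time as the overall response time
minus the active server time. The figures are copied from the table; the
treatment of the "-" channel entries as zero is an assumption made here.

```python
# Response times in milliseconds, copied from Table ES-5.
# A channel time of 0 stands in for the "-" entries in the rehost columns.
COLUMNS = {
    "9020A":       {"cpu": 2263, "disk": 340, "channel": 91, "active": 2694, "overall": 6735},
    "rehost base": {"cpu": 49,   "disk": 75,  "channel": 0,  "active": 124,  "overall": 128},
    "rehost 1980": {"cpu": 74,   "disk": 102, "channel": 0,  "active": 175,  "overall": 197},
    "rehost 1985": {"cpu": 87,   "disk": 114, "channel": 0,  "active": 201,  "overall": 239},
    "rehost 1990": {"cpu": 123,  "disk": 141, "channel": 0,  "active": 264,  "overall": 357},
    "rehost 1995": {"cpu": 224,  "disk": 190, "channel": 0,  "active": 414,  "overall": 845},
}


def active_gap(col: dict) -> int:
    """Difference between the summed component times and the tabulated total."""
    return col["cpu"] + col["disk"] + col["channel"] - col["active"]


def implied_passive(col: dict) -> int:
    """Passive server time implied by overall = active + passive."""
    return col["overall"] - col["active"]


for name, col in COLUMNS.items():
    print(f"{name:12s} gap={active_gap(col):2d} ms  passive={implied_passive(col):5d} ms")
```

The only nonzero gap is 1 ms in the 1980 column, which is consistent with
rounding of the tabulated values.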

Technical issues. Can the NAS software be made to run successfully on
the rehost machine? There are two aspects to this question.

First, the 9020's execute about fifteen special instructions that are
not standard System/360 instructions and that could not be executed by the
rehost machine. There are a number of different methods that could be used
to deal with these instructions; the discussion indicates how these
instructions could be handled by trapping and emulating the instructions, by
changing the operation code, or by doing nothing since the instruction would
not be executed in the rehost system.
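
The three ways of handling a special instruction can be pictured with a toy
interpreter. The sketch below is purely illustrative: the opcode values are
placeholders and do not correspond to real 9020 or System/360 encodings, and
the function names are invented for this example. Opcodes that can be
remapped are rewritten in a static pass before loading, opcodes that need
emulation are trapped at run time, and opcodes that should never be reached
raise an error if they are.

```python
# Placeholder opcode values for illustration only; they are not real 9020 or
# System/360 encodings.
STANDARD = {0x20, 0x21}            # executed natively by the rehost machine
EMULATED = {0xF1}                  # trapped at run time and emulated
REMAPPED = {0xF2: 0x21}            # operation code changed before loading
NEVER_EXECUTED = {0xF3}            # left alone; not reached on the rehost system


def patch_image(image):
    """Static pass: replace each remappable opcode with its standard equivalent."""
    return [REMAPPED.get(op, op) for op in image]


def run(image):
    """Execute opcodes in order, trapping to the emulator where needed."""
    trace = []
    for op in image:
        if op in STANDARD:
            trace.append(("native", op))
        elif op in EMULATED:
            trace.append(("emulated", op))  # a trap handler would run here
        else:
            # Reaching a never-executed or unknown opcode signals a porting error.
            raise RuntimeError(f"unhandled opcode {op:#04x}")
    return trace


print(run(patch_image([0x20, 0xF2, 0xF1])))
```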

Second, a number of features of the 9020 environment pose potential
problems for the rehost system. These problems pertain to memory usage
(relating to page zero, storage keys, immediate instructions, and memory
size), timer usage and synchronization, program status word format, devices
and channel program usage, and diagnosis and error analysis. The details of
these problems and possible methods of dealing with them are discussed.

The conclusion is that the technical problems of rehosting the NAS
software  can  be  readily  dealt  with.  While  the  methods  sketched  out  might 
not  be  the  best,  they  do  at  least  show  that  suitable  methods  do  exist. 

Transition.  The  FAA  has  established  the  requirement  that  when  the  new 
machine is installed, there should be no significant interruption in the
seven days a week, twenty-four hours a day provision of air traffic control
services. Moreover, there must be a ninety day period in which both the old
and new systems are operating so that there will be a proved back-up to the
new system. In order to achieve these transition goals three problems must
be  dealt  with. 

First,  there  are  remodeling  problems  since  the  site  would  have  to  be 
prepared  for  the  new  system.  Second,  there  are  technical  problems  since 
cables  must  be  connected  so  that  inputs  can  be  directed  to  either  system  and 
so outputs can be supplied from either system; the technical problem of how
the old system can take over if the new system fails must also be dealt
with. Third, there are personnel problems since the training of personnel
must be scheduled so that the lag between training and when the new system
is installed is minimized; the training schedule, however, must prevent the
center from being undermanned at any time. No detailed transition plan was
developed, but an analysis of transition issues did not uncover any serious
difficulties. It can be concluded that potential problems can be avoided by
advance thinking and careful preparation.

Growth potential. In order to minimize the trauma of transition and to
avoid the expense of repeated replacement, a system that can gradually
evolve through time is desired. There are three ways that a system should
be able to evolve. First, it should be able to be upgraded to include new
technology as that technology becomes available. Second, it should be able
to increase its capacity as the load on the system makes additional capacity
necessary. These first two criteria are mainly related to hardware, and
they are met since the rehost system consists of standard, off-the-shelf
hardware. For example, an Amdahl V7 can be field-upgraded to a V8 over a
weekend. Therefore, mainframes can be upgraded, memory can be added up to
16 megabytes, and peripherals can be replaced as desired without seriously
interrupting air traffic control services. In this way the system can
reflect current technology and offer increased capacity.

The third type of evolution is that the system should be able to provide
additional functions as the scope of air traffic control changes. For
example, the system should be able to handle the increasing levels of
automation that are being introduced. This criterion is largely related to
software. Since the rehost system uses the NAS software, which does not
reflect modern programming practices, gradual changes to the NAS software
would be difficult; in this sense, then, functional evolution of the system
would not proceed smoothly. Once the rehost system is operating, however,
the software could be totally rewritten, and in this way the rehost system
could be put into a form that could evolve to satisfy growth in air traffic
and to support fully automated air traffic control. (One outstanding issue
is whether the rehost system could provide the level of reliability and
availability that is needed in the long term.)

Summary. The question which the FAA is preparing to answer is: How
should the procurement of a computer system to replace the 9020's proceed?
This report does not attempt to address this entire question; it only looks
at the pros and cons of the strategy of rehosting the NAS software on
instruction-compatible machines. This report finds that the rehost system
could provide improved levels of reliability and performance, and the risks
due to possible technical problems and the transition appear to be
acceptable. The conclusion is that the rehost system is a suitable system
to adopt. This is not to say that the rehost system is the best system or
that it should be adopted; this statement would require one to look not only
at the rehost system but also at alternate systems, and this is beyond the
scope of this report.
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1.  REHOSTING  THE  NAS  SOFTNARE 


I.l  BacAqcound.  PurPO««.  «nd  Organization  of  this  R«Port 

One of the missions of the Federal Aviation Administration (FAA) is to
provide en route air traffic control services. To fulfill this mission the
FAA has established in the continental U.S. twenty air route traffic control
centers (ARTCC's), each equipped with computer systems that collect,
transfer, and process the data that are used to keep current the displays
and print the flight strips used by the air traffic controllers. These
computer systems, along with all associated hardware and software, are known
as the automation systems of the National Airspace System (NAS).

These  computer  systems  have  been  in  place  and  supporting  air  traffic 
control  (ATC)  for  the  last  ten  years  and  can  be  expected  to  provide 
effective  support  for  some  time  to  come.  If  air  traffic  increases  as 
forecast, however, there will eventually come a time when these systems
approach saturation and will not be able to keep the controllers' displays
sufficiently up-to-date. Even if traffic does not increase as forecast, the
age  of  the  system,  the  increasing  difficulty  of  acquiring  spare  parts,  or 
the  desire  to  have  a  system  with  greater  capability  means  that  the  system 
will  be  replaced  in  the  not  too  distant  future. 

The  FAA’s  current  plan  is  to  fully  replace  the  current  system  by  1990 
with the Advanced Computer System [FAA80, p. 16]. If it turns out that the
current  computer  system  cannot  keep  up  with  the  growth  in  air  traffic  that 
takes  place  before  1990,  then  the  options  are  to  either  restrict  air  traffic 
to  a  level  that  the  current  system  can  handle  or  to  adopt  an  interim, 
short-term  system  that  will  stretch  the  life  of  the  current  system  until 
full  replacement  can  be  carried  out.  The  FAA  is  currently  studying  a  number 
of  potential  interim  systems. 

The  purpose  of  this  report  is  to  examine  the  interim  system  that  results 
when  the  current  NAS  software  is  rehosted  on  an  instruction-compatible 
machine.  That  is,  the  hardware  would  be  replaced  by  modern  machines  that 
could execute the NAS software, and the software would only be changed
insofar as was necessary for it to run on the new machines. This option
will for brevity be referred to as "rehosting" or "instruction-compatible
replacement." The potential advantages that are sometimes claimed for this
option are as follows.

•  The modern hardware would be fast enough to eliminate any capacity
problems.

•  The modern hardware would be much more reliable than the current
hardware.

•  Since much of the software would run on the new system without
change, the time, money, and risk involved in developing new software
could be largely avoided.

•  Since instruction-compatible machines are available off-the-shelf,
this option could be implemented quickly if it looks like capacity
problems are imminent.

This report will critically examine these claimed advantages and also
search for any disadvantages that might result if this option were adopted.
The report is organized in the following way:

•  Ch. 2: Reliability - What would be the availability of the new system
compared to the current system? How often would failures
occur that degraded system performance? What would be the
expected duration of a system failure?

•  Ch. 3: Performance - What response time can be expected from the new
system compared to the current system?

•  Ch. 4: Technical Issues - Can the NAS software be made to run on a
modern machine with a reasonable amount of effort?

•  Ch. 5: Cost - What extra cost would this option entail?

•  Ch. 6: Transition - What problems would occur during the transition
to the new system?

•  Ch. 7: Schedule - When could the new system be in place and operating?

•  Ch. 8: Growth Potential - Does the system have the capability to
evolve smoothly as technology advances and as there are
changes in the services that ATC provides?

The  purpose  of  these  chapters  is  to  point  out  the  arguments  for  and  against 
the  rehosting  option  so  the  FAA  will  have  the  information  needed  to  decide 
whether  this  option  should  be  adopted. 

The  remainder  of  this  chapter  describes  the  current  computer 
configuration at the ARTCC's and the baseline rehost configuration.
Variants on the baseline rehost configuration have been suggested, and some
of  them  are  mentioned  in  Sec.  1.4;  these  variants  are  not  discussed  in 
detail,  however,  since  it  is  expected  that  they  would  not  significantly 
alter  the  analysis. 

1.2  The Current Computer Configuration


The  computer  system  that  supports  the  NAS  at  each  ARTCC  has  two  parts. 
First, the central computer complex (CCC) receives inputs from the radar,
flight service stations, controllers, and other sources and then performs
the flight data processing and the radar data processing. In other words,
the CCC takes the raw information and converts it into a form that is useful
to the controller. Second, the display channel takes the output from the
CCC and uses it to keep each controller's plan view display (PVD) current.
The  CCC  and  display  channel  together,  then,  are  responsible  for  taking  the 
raw  data  about  what  is  happening  in  the  sky  and  providing  it  to  the 
controller  in  a  way  that  can  be  readily  grasped  and  acted  on.  Figure  1-1 
shows  a  block  diagram  of  the  current  system. 

The CCC at ten of the ARTCC's uses an IBM 9020A system; Figure 1-2 shows
the configuration of the 9020A system. Originally all the CCC's were to
have been 9020A's, but it was feared that this would not provide enough
capacity for the busier centers, so ten ARTCC's have the more powerful IBM
9020D for a CCC; Figure 1-3 shows the configuration of a 9020D. (The FAA is
adding another storage element (SE) to each 9020A and 9020D; these
additional SE's are not shown in Figures 1-2 and 1-3.)

FIGURE 1-1: TOP-LEVEL ARTCC COMPUTER CONFIGURATION
(N.B. Acronyms explained in Sec. 1.3.)

The  display  channel  at  fifteen  of  the  centers  is  a  Raytheon  730.  Since 
it  was  thought  that  the  Raytheon  730  would  not  provide  enough  capacity  for 
all the centers, five are equipped with an IBM 9020E in the display channel.

The term "9020's" will, somewhat inaccurately, be used throughout this
report to refer to the computers in the CCC and the display channel. Table
1-1 shows which computers are in place at each ARTCC. (En route ATC is also
provided at three sites outside the continental U.S. These sites are not
considered  in  this  report  since  they  use  a  version  of  the  ARTS  system  rather 
than  the  9020  system.  The  FAA's  plan  is  that  these  sites  will  get  the  same 
equipment  as  the  other  ARTCC  sites  when  the  Advanced  Computer  System  is 
installed.) 

To indicate the possible problems that the 9020's face, Table 1-2 shows
the judgments made by one study as to where there are bottlenecks that
potentially limit capacity. I/O bandwidth, I/O device speed, and memory
capacity are seen as likely bottlenecks in both the 9020A and 9020D systems;
memory bandwidth and processing capacity are further bottlenecks in the
9020A system. This quick survey serves to show some of the problems that
rehosting must be able to deal with. (Further information on resource usage
is in [NIEL77a], [NIEL77b], and [KAMD77].)


1.3  The  Baseline  Rehost  Configuration 

In order to analyze the feasibility of rehosting the NAS software in an
instruction-compatible computer system, a generic rehost computer system
has been configured. This generic system is referred to as the baseline
rehost system in this report and is shown in a block diagram format in
Figure 1-4 and with possible components in Figure 1-5.

FIGURE 1-3: SIMPLIFIED 9020D CONFIGURATION DIAGRAM [CLAP79]

Si       -  Selector Channel
MXi      -  Multiplexor Channel
PAM      -  Peripheral Adapter Module
CDC/DCC  -  Display Channel


TABLE 1-1: COMPUTER SYSTEM CONFIGURATIONS FOR THE ARTCC'S

Center            CCC          Display      Sectors

Albuquerque       IBM 9020A    Ray 730      34
Atlanta           IBM 9020D    Ray 730      41
Boston            IBM 9020A    Ray 730      32
Chicago           IBM 9020D    IBM 9020E    43
Cleveland         IBM 9020D    IBM 9020E    47
Denver            IBM 9020A    Ray 730      34
Fort Worth        IBM 9020D    IBM 9020E    39
Houston           IBM 9020A    Ray 730      41
Indianapolis      IBM 9020D    Ray 730      34
Jacksonville      IBM 9020D    Ray 730      37
Kansas City       IBM 9020D    Ray 730      36
Los Angeles       IBM 9020D    Ray 730      37
Memphis           IBM 9020A    Ray 730      36
Miami             IBM 9020A    Ray 730      28
Minneapolis       IBM 9020A    Ray 730      34
New York City     IBM 9020D    IBM 9020E    39
Oakland           IBM 9020A    Ray 730      39
Salt Lake City    IBM 9020A    Ray 730      21
Seattle           IBM 9020A    Ray 730      22
Washington DC     IBM 9020D    IBM 9020E    36


TABLE 1-2: THE CRITICAL 9020 RESOURCES

                       Is this resource a bottleneck?
Resource               9020A        9020D

I/O Bandwidth          Yes          Yes
I/O Device Speed       Yes          Yes
Memory Capacity        Yes          Yes
Memory Bandwidth       Yes          No
Processing Capacity    Yes          No

Source: [CLAP79, p. C-20]


FIGURE 1-4: BASELINE REHOST CONFIGURATION
(N.B. Acronyms explained in Sec. 1.3.)

FIGURE 1-5: DETAILED BASELINE REHOST CONFIGURATION

The baseline rehost system is representative of any rehost system that
would resolve the 9020's bottlenecks and satisfy the operational
constraints of the ARTCC facilities. The following is a list of constraints
and precepts that any rehost computer system must satisfy:


•  The rehost system will replace both the CCC (9020D and 9020A systems)
and the display channels (Raytheon 730 and 9020E systems).

•  There will be two mainframe computers in the rehost configuration,
and they will be used in a duplex mode of operation.

•  Each mainframe will be capable of supporting all of the current CCC
and display channel processes.

•  The two mainframes will be connected via a channel-to-channel adapter.

•  The interface to the controller suites will be at the display
generators and the keyboard control units.

•  The interface to the radar data input circuits will be at the circuit
terminations.

•  The interface to the various communications circuits will be at the
peripheral adapter module (PAM).

•  The interface to the radar keyboard multiplexer (RKM) will be at the
data adapter unit (DAU).

•  All the devices local to the processor (for example, disk, tape, line
printers, operator console, and terminals) will be replaced.

•  The controller suites, the flight strip printers (FSP), and the
non-radar keyboard multiplexors (NRKM) will be retained without
change.

•  DARC will provide an independent radar data channel capability for
backup purposes.

•  All devices will be connected to both mainframes and have two paths
to each mainframe.

•  Each unique device type will be represented by at least two devices.

These general properties are not enough to define the rehost system
completely. Therefore, to complete the characterization of the baseline
rehost system, this report adopts the following assumptions.

•  Each mainframe will have eight megabytes of main memory with a growth
potential to sixteen megabytes.

•  Each mainframe will have twelve channels that can operate either in
multiplexor or block multiplexor mode. These channels are divided
into six pairs, where one channel in each pair is redundant. Figure
1-4 shows how these six pairs are connected.

•  The normal mode of operation would be for one mainframe to support
all of the processes while the other is maintained in a "hot standby"
status.

•  A radar input line multiplexor (RIM/LM) will serve each radar data
input circuit and block valid radar data into records for processing
by the mainframe. RIM/LM is described in App. D.

•  A display buffer will be located between the mainframe and each
display generator to provide the necessary display file memory for
the display generators and avoid memory contention problems in the
mainframe. The display buffer is described in App. D.

The  baseline  rehost  configuration  is  a  representative  configuration  rather 
than  the  only  or  best  configuration.  Permutations  to  this  configuration  are 
described  in  the  next  subsection.  General  operational  procedures  for  the 
rehost  system  are  described  after  the  discussion  of  permutations  to  the 
configuration.

The cost, transition, and schedule analysis assumes that the rehost
system is deployed at the twenty ARTCC's plus the FAA Technical Center (two
systems) and the FAA Aeronautical Center, for a total of twenty-three
systems. App. C discusses the cost saving if there is partial replacement.

1.4  Permutations to the Baseline Rehost Configuration

The baseline rehost configuration provides a basis for discussing the
advantages and disadvantages of the concept of rehosting. There are,
however, many variants on the baseline that should be mentioned. The
leading variants and some of their features are as follows.

•  Replace only the CCC component of the current computer system and
retain the current display channels. There are several problems with
a CCC only replacement.

   *  Actual experience with DARC and the 9020E has led to the
      conclusion that the CCC/display channel interface is more complex
      than the display channel/display generator interface.

   *  A display channel outage would result in a system outage since the
      current display channels are redundant on a component basis but
      not on a unit basis.

   *  The queueing delays for the Raytheon 730 display channel [NIEL77a]
      would remain unresolved.

   *  The ongoing maintenance costs for the current display channels
      would exceed the incremental cost of replacing the display
      channels as part of the CCC replacement.

•  Retain the current 2314 disks to avoid any embedded channel program
problems that could arise with new disks and their associated device
support routines. Several benefits of current technology disks would
not be available to the rehost system:

   *  shorter access and latency time than those for 2314 disks,

   *  higher data transfer rates than those for 2314 disks,

   *  better reliability characteristics than those for 2314 disks,

   *  larger storage capacity than that for 2314 disks.

•  Replace the PAM's and DAU's with newer technology control units and
line controllers. The engineering costs for redeveloping all of the
necessary line controllers and the problems associated with the
physical transition to the replacement PAM's and DAU's must be
weighed against current maintenance costs for these units and the
need for flexibility.

•  Consider supporting RIM in the mainframe. RIM is currently supported
with an open loop channel program in the IOCS. This sort of channel
program would not be viable in a mainframe. However, the other
alternative of interrupt driven radar data input for RIM would
destroy the performance of the mainframe.

A  programmatic  variant  to  the  baseline  rehost  configuration  is  to  deploy 
the  rehost  system  at  only  the  overloaded  ARTCC's.  This  partial  deployment 
would reduce the rehost hardware procurement costs but would require
logistics  and  maintenance  support  for  two  very  different  systems.  In 
addition,  a  partial  deployment  of  the  rehost  system  would  make  more 
difficult  an  orderly  evolution  of  the  ATC  functions. 

1.5  Operation of the Baseline Rehost Computer System

In order to manage the resources of the mainframe and to support local
data processing activities, it is expected that a virtual machine
environment will be provided in the rehost mainframes. VM/370 represents a
viable virtual machine monitor for this application. Another option is to
use the kernel of VM/370 as a basis for developing a virtual machine monitor
unique to the needs of rehosting the NAS software and providing a virtual
environment for supporting the local data processing activities as well as
the evolving ATC functions. Whatever monitor is used for the mainframe, it
must support:

•  the application component of the NAS software without revision and
with a minimum of monitor overhead;

•  the local utility programs for data analysis, report generation,
adaptation assemblies, and system generation.

There  are  two  possible  modes  for  operating  the  rehosted  NAS  software 
since  the  baseline  rehost  configuration  will  have  two  mainframes  to  satisfy 
availability  requirements  and  each  mainframe  will  have  sufficient  processor 
capacity  to  support  both  the  CCC  processes  and  the  display  channel 
processes. One mode of operation is to designate one mainframe as the
"active" processor and the other as the "standby" processor with automated
support  for  transferring  active  status  in  the  event  of  a  failure.  The  other 
mode  of  operation  is  to  assign  the  CCC  processes  to  one  processor  and  the 
display  channel  processes  to  the  other  processor.  In  this  split  mode  of 
operation,  a  failure  of  one  processor  would  result  in  the  transfer  of  all 
processes  to  the  operational  processor.  The  analysis  is  based  on  the  first 
mode of operation because it will simplify the backup procedures.


2.  RELIABILITY


2.1  Purpose  and  Organization  of  this  Chapter 

The  reliability  of  an  air  traffic  control  computer  system  is  an 
important  criterion  in  deciding  whether  it  should  be  procured  since  system 
outages  can  cause  delay  and  cancellation  of  flights  as  well  as  a  decreased 
level  of  safety  and  an  increased  workload  on  the  controllers.  The  purpose 
of  this  chapter  is  to  discuss  the  reliability  of  the  rehost  system  compared 
to  that  of  the  9020  system. 

The two main questions of interest that this chapter focuses on are:
How often does a system failure occur? How long will it last? In
considering these questions it is important to remember that the failure of
a single unit of hardware does not cause a system failure. This is because
both the 9020 and rehost systems have redundant hardware and the capability
to automatically reconfigure the system so that the interruption in the
operation of the computer system is only a matter of seconds when an
individual unit fails. For example, the 9020D has three compute elements
(CE's); under normal operation two are active and one is redundant. If one
of the active CE's fails, then the 9020D automatically reconfigures so that
the redundant CE is made active; this process takes perhaps 25-30 seconds.
This means that even if one component fails, there is no system failure,
i.e., no significant interruption in service, because of the redundancy
built into the system. In this example there would only be a system failure
if, before the first failed CE were repaired or replaced, a second CE
failed. This shows how redundancy can lessen but not eliminate system
failures.

This chapter is organized as follows. Sec. 2.2 shows how the system
availability, system mean time between failure (MTBF), and average duration
of a system failure can be estimated from information on the MTBF of
individual units, the mean time to repair (MTTR) individual units, and the
configuration of the system. The system availability, system MTBF, and the
expected duration of the system outage are estimated for the 9020D/9020E
system and the rehost system. A sensitivity analysis is carried out that
shows how these estimates vary if alternate assumptions are used. This
analysis only considers hardware and does not discuss software reliability.
Sec. 2.3 looks at hardware reliability from another angle by assuming that
no repairs are made; system MTBF's are estimated, and a sensitivity analysis
is carried out.

Sec. 2.4 gives a qualitative discussion of the software reliability that
could be expected from the rehost system. Sec. 2.5, in a tentative,
quantitative analysis, then goes on to make numerical assumptions about
software reliability, and estimates the availability, MTBF, and expected
duration of an outage for the system as a whole that takes into account both
hardware and software.

Finally,  Sec.  2.6  discusses  the  failures  that  result  from  miscellaneous 
problems  such  as  errors  by  human  operators  and  technicians. 

A  cautionary  note  should  be  sounded  about  the  theoretical  nature  of  the 
results reported in this chapter. If the assumptions made about component
MTBF's and repair time are correct, then the results in this chapter are
valid. The data that are available to check these assumptions, however, are
incomplete; therefore, there is doubt about the accuracy of the
assumptions. Moreover, it would have been desirable to validate the model
by checking the results against the measured MTBF's and availability of the
9020's, but the available data were too incomplete to allow this. For these
reasons  the  reader  should  reserve  judgment  on  the  accuracy  of  this  chapter's 
results.  Like  the  miles  per  gallon  figures  featured  in  automobile 
advertisements,  these  results  are  to  be  used  only  for  purposes  of  comparison. 

Some  terminology  is  needed.  Availability  is  the  amount  of  time  a  system 
is  working  divided  by  the  sum  of  the  time  the  system  is  working  and  the  time 
the  system  is  not  working.  Equivalently,  availability  is  the  probability 
that the system is working at a randomly chosen point in time. The terms
operating, working, up, and not failed are used synonymously. It is assumed
that a unit or a system is either failed or not failed; no intermediate
stage of partial failure is considered. The terms component, unit, element,
and device are used synonymously to refer to a single storage element (SE),
compute element (CE), input/output compute element (IOCE), tape control unit
(TCU), or storage control unit (SCU, i.e., disk). All of the like units are
referred to as a subsystem, e.g., the three CE's in a 9020D are the CE
subsystem.

2.2  Hardware Reliability


2.2.1  Introduction

This section, which considers hardware only, presents estimates of the
system availability, system MTBF, and the expected duration of a system
outage for the rehost system and for a system with a 9020D in the CCC and a
9020E in the display channel. The exposition proceeds in four steps.

•  Develop a theoretical model which expresses system availability,
system MTBF, and the expected duration of a system outage as a
function of the unit MTBF's and unit mean time to repair.

•  Determine the MTBF's and MTTR's for each component.

•  Substitute these MTBF's and MTTR's into the model to obtain estimates
of system availability, system MTBF, and the expected duration of a
system outage.

•  Carry  out  a  sensitivity  analysis  to  determine  how  the  results  vary  if 
alternate  assumptions  are  used. 

Each  of  these  steps  is  now  discussed. 

2.2.2  The Model of System Availability and System MTBF

App.  A  presents  a  detailed  derivation  of  the  equations  and  methods  used 
to  estimate  system  availability  and  MTBF.  This  subsection  describes  this 
analysis.  It  has  three  main  steps. 


First, determine the configuration of each system to be modeled. The
two systems modeled are the rehost system and a system with a 9020D in the
CCC and a 9020E in the display channel. (Information on the Raytheon 730
was not sufficient to allow it to be modeled.) Consider the rehost system.
Define a mainframe to consist of a CPU, a memory, and six pairs of
channels. A mainframe is working if the following three conditions all hold:

•  the CPU is working;

•  the memory is working;

•  at least 1 channel in each pair is working.

The rehost system is working if the following three conditions all hold:

•  at least 1 of the 2 mainframes is working;

•  at least 1 of the 2 TCU's is working;

•  at least 1 of the 2 SCU's is working.

For the 9020D, the system is working if the following five conditions all
hold:

•  at least 2 of the 3 CE's are working;

•  at least 5 of the 6 SE's are working;

•  at least 2 of the 3 IOCE's are working;

•  at least 2 of the 3 TCU's are working;

•  at least 2 of the 3 SCU's are working.

It is assumed that for this analysis the 9020D and 9020E are equivalent.
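
The working conditions listed above can be read as simple predicates. The
following is a minimal sketch, not part of the original analysis; the
function names and dictionary keys are hypothetical labels chosen for
illustration, and the thresholds are those stated in the lists above.

```python
from typing import List, Mapping, Tuple

# Minimum number of working units required in each subsystem.
# Keys are hypothetical labels; thresholds follow the lists in the text.
REHOST_NEEDS = {"mainframe": 1, "TCU": 1, "SCU": 1}
S9020D_NEEDS = {"CE": 2, "SE": 5, "IOCE": 2, "TCU": 2, "SCU": 2}


def mainframe_working(cpu_ok: bool, memory_ok: bool,
                      channel_pairs: List[Tuple[bool, bool]]) -> bool:
    """A mainframe works if CPU, memory, and at least one channel
    in every pair are working."""
    return cpu_ok and memory_ok and all(a or b for a, b in channel_pairs)


def system_working(working_units: Mapping[str, int],
                   needs: Mapping[str, int]) -> bool:
    """The system works if every subsystem has at least the required
    number of working units."""
    return all(working_units.get(name, 0) >= k for name, k in needs.items())


# Example: a 9020D with one failed CE is still working; a second CE
# failure before repair is a system failure.
one_ce_down = {"CE": 2, "SE": 6, "IOCE": 3, "TCU": 3, "SCU": 3}
two_ce_down = {"CE": 1, "SE": 6, "IOCE": 3, "TCU": 3, "SCU": 3}
print(system_working(one_ce_down, S9020D_NEEDS))
print(system_working(two_ce_down, S9020D_NEEDS))
```

Keeping the thresholds in a table separate from the predicate makes it easy
to try a different configuration without changing the logic.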

Second, derive the equations that express system reliability. For a
single unit the equation for its availability is

    A_u = MTBF_u / (MTBF_u + MTTR_u) .

From the availability of a single unit, an equation is derived that states
the availability of a subsystem, e.g., the probability that at least 2 of
the 3 9020D CE's are working at a randomly chosen point in time. From the
availability of the subsystems, the availability of the complete system is
derived, i.e., the probability that the system is working at a randomly
chosen point in time.
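
One way to illustrate this step is sketched below. It assumes that the
units within a subsystem are identical and fail independently, so that the
"at least k of n" probability is a binomial sum, and that the system works
only when every subsystem works. The exact equations used in the report are
those of App. A and may differ; the function names and the numerical values
are placeholders, not figures from Table 2-1.

```python
from math import comb, prod


def unit_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """A_u = MTBF_u / (MTBF_u + MTTR_u) for a single unit."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def k_of_n_availability(k: int, n: int, a_unit: float) -> float:
    """Probability that at least k of n identical, independent units are up
    (assumed binomial model)."""
    return sum(comb(n, i) * a_unit**i * (1.0 - a_unit)**(n - i)
               for i in range(k, n + 1))


def system_availability(subsystem_availabilities) -> float:
    """All subsystems must be up, so their availabilities multiply."""
    return prod(subsystem_availabilities)


# Placeholder figures (hours); they are NOT the values of Table 2-1.
a_ce = unit_availability(mtbf_hours=1000.0, mttr_hours=2.0)
a_ce_subsystem = k_of_n_availability(2, 3, a_ce)  # at least 2 of 3 CE's
print(a_ce, a_ce_subsystem)
```

With these placeholder inputs the redundant subsystem is more available
than any single unit in it, which is the effect of redundancy described in
Sec. 2.1.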

Third, the system MTBF is derived. If the equation above is interpreted
to apply to the system instead of just a unit and is solved for MTBF, one
gets

    MTBF_s = (A_s / (1 - A_s)) MTTR_s .

Since system availability A_s was calculated in the second step, system
MTBF can be estimated once the system MTTR is calculated. The system MTTR,
which is the same thing as the expected duration of a system outage, is
estimated with a special calculation. This completes the outline of how
information about unit MTBF's, unit MTTR's, and the configuration can be
combined to estimate the system availability, MTBF, and MTTR. (It is
important to note that the variables in the first equation refer to a single
unit while those in the second refer to the entire system.)
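
The rearrangement from availability to MTBF can be checked numerically.
The sketch below is illustrative only; the function name and the input
values are assumptions, and the "special calculation" of system MTTR
mentioned above is not reproduced here.

```python
def system_mtbf(a_s: float, mttr_s: float) -> float:
    """MTBF_s = (A_s / (1 - A_s)) * MTTR_s, valid for 0 < A_s < 1."""
    if not 0.0 < a_s < 1.0:
        raise ValueError("availability must lie strictly between 0 and 1")
    return a_s / (1.0 - a_s) * mttr_s


# Round trip with placeholder numbers (hours): start from an MTBF and an
# MTTR, form the availability, and recover the MTBF again.
mtbf_in, mttr_in = 500.0, 2.0
a = mtbf_in / (mtbf_in + mttr_in)
mtbf_out = system_mtbf(a, mttr_in)
print(mtbf_out)
```

The round trip recovers the starting MTBF up to floating-point rounding,
which confirms that the second equation is the first one solved for MTBF.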

In  summary,  the  main  assumptions  used  in  this  derivation  are: 

•  all failures are probabilistically independent;

•  the MTBF for each unit is finite;

•  all repair times are independent and exponentially distributed;

•  when an active unit fails and is replaced by a redundant unit, the
reconfiguration is instantaneous;

•  a system failure only occurs under the conditions spelled out above;

•  no more than one of these conditions is violated at any one time.

There  are  three  ways  in  which  these  assumptions  are  not  exact.  First, 
repair  times  are  not  independent  since  the  time  it  takes  to  repair  one  unit 
is  affected  by  whether,  when  it  fails,  there  are  any  other  units  being 
repaired.  Second,  reconfiguration  after  a  unit  failure  is  not 
instantaneous;  in  the  9020D  it  takes  perhaps  25-30  seconds.  Third,  it  is 
possible  that  two  of  the  conditions  could  be  simultaneously  violated;  this 
is, however, an extremely unlikely event. None of these three simplifying
assumptions has a significant effect on the results.


2.2.3  MTBF and MTTR Data


In order to estimate system availability and system MTBF, the model just
described requires data on the MTBF and MTTR for each component. For the
rehost system, MTBF's are taken from [RUTL81]. The original source of this
data is Reliability Research, Inc., which gathers data on the actual
reliability of equipment operated by a variety of users. The MTBF's refer
to machines comparable to those that would probably be used in a rehost
system. The MTBF's used are shown in Table 2-1. The values in the column
labeled "Best Estimate" are taken from [RUTL81]. Low and high estimates,
which are needed for the sensitivity analysis, are also shown; the low and
high estimates are, respectively, half and twice the best estimate.
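
The half-and-twice rule for the low and high estimates can be sketched as
follows. This is an illustration under stated assumptions: the function
names are hypothetical, and the MTBF and MTTR values are placeholders
rather than entries from Table 2-1.

```python
def mtbf_estimates(best_hours: float) -> dict:
    """Low and high estimates are half and twice the best estimate."""
    return {"low": best_hours / 2.0, "best": best_hours,
            "high": best_hours * 2.0}


def unit_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """A_u = MTBF_u / (MTBF_u + MTTR_u)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Placeholder inputs (hours), not taken from Table 2-1.
best_mtbf, mttr = 1000.0, 2.0
sweep = {label: unit_availability(mtbf, mttr)
         for label, mtbf in mtbf_estimates(best_mtbf).items()}
for label, availability in sweep.items():
    print(label, round(availability, 6))
```

Running the model once per estimate shows how far the availability moves
when the MTBF assumption is halved or doubled, which is the purpose of the
sensitivity analysis.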

For the 9020D system, all MTBF's except that for the SCU were taken from
[MOSS75, p. 25], which reports failure data on the 9020D at the FAA Technical
Center over the one-year period starting October 1, 1971. For the CE, SE,
and IOCE this study computes the lower and upper bounds of the 95 percent
confidence interval; these bounds are displayed in Table 2-1 along with the
best estimate. No interval was computed for the TCU since no failures were
observed; Table 2-1 uses for the low and high estimates, respectively, half
and twice the number given in [MOSS75]. The MTBF for the SCU is 25,358
hours in [MOSS75]; this is not a representative figure since disks were just
being introduced during the period of observation and were not fully
utilized. Therefore, the MTBF for the SCU is taken from [RUTL81]. [RUTL81]
is not a good source for the other 9020D MTBF's because, while figures for
System/360 components are given, they are based on extremely small samples.

In sum, the MTBF's used in this study are shown in Table 2-1. These are
the best figures that could be obtained, but it should be stressed that
there are real doubts as to the accuracy of these figures. For example,
this table shows that a 9020D CE has a higher MTBF than a modern CPU; this
is counter to the generally accepted opinion that modern technology is much
more reliable than System/360 technology. A sensitivity analysis using
alternate MTBF's is carried out to try to minimize this problem, but a
sensitivity analysis is not a substitute for good data. Therefore, because
non-comparable and perhaps inaccurate data are used, one should reserve
judgment on the accuracy of the results reported.

TABLE 2-1: MTBF's USED IN THIS STUDY

                              MTBF (hours)

9020D Components      Low    Best Estimate      High

CE                  1,391        2,301         4,116
SE (1/2 MB)         2,089        3,173         5,052
IOCE                1,750        3,161         6,354
TCU                 4,241        8,482        16,964
SCU                   350          700         1,400

Rehost Components     Low    Best Estimate      High

CPU                   678        1,356         2,712
Memory (2 MB)*      1,147        2,293         4,586
Channel               658        1,316         2,632
Tape                  500        1,000         2,000
Disk                3,582        7,163        14,326

* Since each rehost mainframe has 8 megabytes of main memory, the MTBF's
used in the calculations are one-fourth of the figures shown in this
table.

Source: [MOSS75, p. 25] and [RUTL81]

The time to repair each component is assumed to be exponentially
distributed with a mean of one hour. This assumption was chosen after
talking to Airway Facilities Service personnel and after examining the data
in [MOSS75, p. 25]. Since the data on repair times is scanty, this
assumption should be treated as tentative. It is also assumed that the
repair time for any unit is independent of whether any other units are
failed. In effect, this assumption means that failed units need not queue
up waiting for a repairman.
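
To make the role of these inputs concrete, the following minimal sketch computes the long-run availability of a single repairable unit as MTBF / (MTBF + MTTR), using a few best estimates from Table 2-1 and the one-hour MTTR. It also reproduces the Table 2-1 footnote on memory under the assumption (ours, not stated in the report) that 8 megabytes is treated as four 2-megabyte modules in series with exponential failure times. The function names are illustrative only; the report's own system model is given in App. A and is not reproduced here.

```python
# Best-estimate unit MTBF's in hours, copied from Table 2-1.
BEST_MTBF_HOURS = {
    "9020D CE": 2301,
    "9020D SCU": 700,
    "Rehost CPU": 1356,
    "Rehost memory (2 MB)": 2293,
}
MTTR_HOURS = 1.0  # mean repair time assumed in the text


def unit_availability(mtbf_hours, mttr_hours=MTTR_HOURS):
    """Long-run fraction of time a single repairable unit is working."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def series_mtbf(module_mtbf_hours, modules):
    """MTBF of identical modules that must all work (exponential failures).

    Failure rates add, so the combined MTBF is the module MTBF divided
    by the number of modules.
    """
    return module_mtbf_hours / modules


# Assumption: 8 MB of memory behaves like four 2 MB modules in series,
# which yields the "one-fourth" figure in the Table 2-1 footnote.
memory_8mb_mtbf = series_mtbf(BEST_MTBF_HOURS["Rehost memory (2 MB)"], 4)

for name, mtbf in BEST_MTBF_HOURS.items():
    print(f"{name}: availability {unit_availability(mtbf):.6f}")
print(f"Rehost memory (8 MB): MTBF {memory_8mb_mtbf:.2f} hours")
```

Even the least reliable unit here is available well over 99 percent of the time; the system-level differences discussed below come from how the units are combined.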

2.2.4 Estimates of System Availability and System MTBF

When the best estimates of component MTBF's are substituted into the
model's equations, the estimates of system availability and system MTBF
shown in Table 2-2 result. The mean time between system failures for the
rehost system is 2905 days, which is about 2 1/2 times the MTBF of 1226 days
for the 9020D/9020E system. Therefore, in this primary calculation the
rehost system is substantially more reliable than the 9020D/9020E system.

For  both  systems  the  expected  duration  of  a  system  outage  is  a  half  hour. 
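
As a rough cross-check, the reported availabilities are consistent with the standard relation availability = uptime / (uptime + downtime), taking the system MTBF as the mean uptime and the half-hour outage as the mean downtime. The sketch below applies that relation to the Table 2-2 figures (1226 and 2905 days); it is a consistency check under that assumption, not the report's App. A derivation.

```python
HOURS_PER_DAY = 24.0
OUTAGE_HOURS = 0.5  # expected duration of a system outage, from the text


def availability(mtbf_days, outage_hours=OUTAGE_HOURS):
    """Fraction of time the system is up, given mean uptime and downtime."""
    uptime_hours = mtbf_days * HOURS_PER_DAY
    return uptime_hours / (uptime_hours + outage_hours)


# (system, MTBF in days, availability reported in Table 2-2)
TABLE_2_2 = [
    ("9020D/9020E", 1226, 0.99998301),
    ("Rehost", 2905, 0.99999283),
]

for system, mtbf_days, reported in TABLE_2_2:
    computed = availability(mtbf_days)
    print(f"{system}: computed {computed:.8f}, reported {reported:.8f}")
```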

TABLE 2-2: SYSTEM AVAILABILITY AND SYSTEM MTBF: PRIMARY CALCULATION

System           System Availability    System MTBF

9020D/9020E      0.99998301             1226 days
Rehost           0.99999283             2905 days

The main reason why the rehost system has a higher MTBF lies in its
configuration: the two rehost mainframes are configured in parallel whereas
the CCC and the display channel in a 9020 system are configured in series.
That is, if one rehost mainframe fails, this does not cause a system
failure. But if either the 9020D or the 9020E fails, this does cause a
system failure. A numerical example will bring out the importance of this
consideration. Suppose that the probability that a particular mainframe is
working is 0.5. Then the probability that at least one mainframe is working
is 1 - (0.5)^2 = 0.75. Now consider the 9020D/E system, and suppose that
the probability of the 9020D working is 0.5, and that the probability of the
9020E working is also 0.5. Both must work to prevent a system failure, and
the probability of both working is then (0.5)^2 = 0.25. This example
indicates how the parallel configuration of the rehost system increases
system availability.
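
The numerical example can be sketched in a few lines. The helper names below are illustrative, and independence of the two units is assumed, as in the model's assumptions.

```python
import math


def parallel_availability(unit_probs):
    """Probability that at least one independent unit is working."""
    return 1.0 - math.prod(1.0 - p for p in unit_probs)


def series_availability(unit_probs):
    """Probability that every independent unit is working."""
    return math.prod(unit_probs)


pair = [0.5, 0.5]
print(parallel_availability(pair))  # two rehost mainframes in parallel
print(series_availability(pair))    # 9020D and 9020E in series
```

With realistic unit availabilities close to 1 the gap is far smaller in absolute terms, but the unavailability of the parallel pair is the product of the unit unavailabilities, while that of the series pair is roughly their sum.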


Another factor that should increase the relative reliability of the
rehost system is that it embodies modern technology, which is much more
reliable than the technology embodied in the 9020's. For example, when an
instruction fails to execute in a 9020, there is a machine check. In
contrast, when an instruction fails to execute in the rehost system, the
machine check is held pending and the instruction is retried; there is a
machine check only if the instruction fails to execute twice. As Table 2-1
shows, this advantage of the rehost system is not fully reflected in the
data used in this study. Because of this apparent flaw in the data, the
reported results probably understate the reliability of the rehost system
compared to the 9020D/9020E system.

2.2.5 Sensitivity Analysis

To show how the results given in Subsec. 2.2.4 are affected by
variations in the data, the results of a sensitivity analysis will now be
presented. Two types of variations are considered. First, to recognize the
uncertainty in the unit MTBF's, the calculation is carried out not only for
the baseline MTBF's but also for the low and high unit MTBF's shown in Table
2-1. Second, to recognize the uncertainty in the unit MTTR, the calculation
is repeated using not only the baseline MTTR of 1 hour but also the
alternate values of 1/2 and 2 hours.

The results for the 9020D/9020E system are shown in Table 2-3 under 39
sets of assumptions (13 sets of unit MTBF's, each evaluated at three values
of the MTTR). The first line, labeled "Baseline," uses the baseline
unit MTBF's from Table 2-1. The second line, labeled "High CE," uses the high
CE MTBF from Table 2-1; the baseline MTBF's are used for the remaining
units. The rest of the cases, with the exception of the last two, similarly
use the baseline MTBF's for all but one component; the table shows for which
component an alternate MTBF is used and whether the alternate MTBF is the
high or low value. The next to last line uses the high MTBF's for all the
components; the last line uses the low MTBF's for all the components.
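
The layout of the sensitivity cases can be sketched as follows. The unit MTBF's are the 9020D figures from Table 2-1; the function and variable names are illustrative, and the sketch only enumerates the cases without evaluating the App. A model.

```python
# 9020D unit MTBF's in hours, copied from Table 2-1.
LOW = {"CE": 1391, "SE": 2089, "IOCE": 1750, "TCU": 4241, "SCU": 350}
BEST = {"CE": 2301, "SE": 3173, "IOCE": 3161, "TCU": 8482, "SCU": 700}
HIGH = {"CE": 4116, "SE": 5052, "IOCE": 6354, "TCU": 16964, "SCU": 1400}
MTTR_HOURS = (0.5, 1.0, 2.0)


def mtbf_cases():
    """Yield (label, unit MTBF's) in the row order of Table 2-3."""
    yield "Baseline", dict(BEST)
    for unit in BEST:
        # Vary one unit at a time; all others stay at the baseline.
        yield f"High {unit}", {**BEST, unit: HIGH[unit]}
        yield f"Low {unit}", {**BEST, unit: LOW[unit]}
    yield "High All", dict(HIGH)
    yield "Low All", dict(LOW)


scenarios = [
    (label, mttr, mtbfs)
    for label, mtbfs in mtbf_cases()
    for mttr in MTTR_HOURS
]
print(len(scenarios))  # 13 MTBF cases x 3 MTTR values
```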

TABLE 2-3: 9020D/9020E SYSTEM AVAILABILITY AND MTBF: SENSITIVITY ANALYSIS

                 MTTR = 1/2           MTTR = 1             MTTR = 2
Component
MTBF's         Avail.      MTBF     Avail.      MTBF     Avail.      MTBF

Baseline     0.99999575    2449   0.99998301    1226   0.99993226     615
High CE      0.99999594    2566   0.99998379    1285   0.99993536     645
Low CE       0.99999525    2195   0.99998105    1099   0.99992442     551
High SE      0.99999619    2739   0.99998481    1372   0.99993945     688
Low SE       0.99999477    1993   0.99997913     998   0.99991676     501
High IOCE    0.99999586    2516   0.99998346    1260   0.99993406     632
Low IOCE     0.99999541    2268   0.99998165    1136   0.99992684     569
High TCU     0.99999576    2458   0.99998307    1231   0.99993251     617
Low TCU      0.99999568    2413   0.99998276    1208   0.99993126     606
High SCU     0.99999804    5306   0.99999215    2655   0.99996866    1330
Low SCU      0.99998660     778   0.99994660     390   0.99978791     196
High All     0.99999881    8760   0.99999525    4383   0.99998102    2195
Low All      0.99998474     682   0.99993915     342   0.99975818     172

N.B. MTBF is in days.

For each case the results are shown for the three assumed MTTR values of
1/2, 1, and 2 hours. For example, this table shows that if the high
value for the IOCE MTBF of 6,354 hours is used, and the baseline MTBF's are
used for the remaining components, then the system MTBF is 2516, 1260, or
632 days depending on whether the assumed MTTR is 1/2, 1, or 2 hours. Table
2-4 shows the results for the rehost system under 39 sets of assumptions;
this table is read the same way as Table 2-3. Five conclusions can be drawn
from the sensitivity analysis.

First, the system MTBF's are very close to being inversely proportional
to the unit MTTR. This holds for both the 9020D/9020E system and the rehost
system. For example, for the 9020D/9020E system, with the baseline MTBF's
and an MTTR of 1/2 hour, the system MTBF is 2449 days; when the MTTR is
doubled to 1 hour, the MTBF is nearly halved to 1226 days; when the MTTR is
again doubled to 2 hours, the MTBF again is nearly halved to 615 days.
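
The near-inverse proportionality can be checked directly from the baseline rows of Tables 2-3 and 2-4, as in the sketch below. For intuition only, the sketch also includes the textbook mean time to failure of a duplex pair of identical repairable units, (3λ + μ) / (2λ²) with λ = 1/MTBF and μ = 1/MTTR; this is an illustrative stand-in, not the App. A model, and `duplex_mttf` is a hypothetical helper name.

```python
# Baseline system MTBF in days, keyed by MTTR in hours (Tables 2-3 and 2-4).
BASELINE_DAYS = {
    "9020D/9020E": {0.5: 2449, 1.0: 1226, 2.0: 615},
    "Rehost": {0.5: 5807, 1.0: 2905, 2.0: 1455},
}


def doubling_ratios(mtbf_by_mttr):
    """Ratio of system MTBF before and after each doubling of the MTTR."""
    mttrs = sorted(mtbf_by_mttr)
    return [mtbf_by_mttr[a] / mtbf_by_mttr[b] for a, b in zip(mttrs, mttrs[1:])]


def duplex_mttf(unit_mtbf, mttr):
    """Textbook MTTF of two identical repairable units in parallel.

    Illustrative only: assumes exponential failure and repair times and
    starts with both units working.
    """
    lam, mu = 1.0 / unit_mtbf, 1.0 / mttr
    return (3.0 * lam + mu) / (2.0 * lam ** 2)


for system, row in BASELINE_DAYS.items():
    print(system, [round(r, 3) for r in doubling_ratios(row)])

# When MTTR is much smaller than MTBF, doubling MTTR nearly halves the MTTF.
print(round(duplex_mttf(1356, 1.0) / duplex_mttf(1356, 2.0), 3))
```

A ratio of exactly 2 would mean exact inverse proportionality; the tabulated ratios fall slightly below 2.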

Second, results for the 9020D/9020E system are very sensitive to the SCU
MTBF but relatively insensitive to the other unit MTBF's. For example,
consider the case where the MTTR is 1 hour. Under the baseline unit MTBF's,
the 9020D/9020E system MTBF is 1226 days. If the SCU MTBF is then raised
from its baseline value of 700 hours to 1400 hours, the system MTBF rises to
2655 days, which is almost as high as the rehost system MTBF of 2905 days
under the baseline assumptions. If the SCU MTBF is instead lowered from 700
to 350 hours, then the 9020D/9020E system MTBF falls to 390 days. No such
wide swings occur when the other unit MTBF's are varied; in the other cases
in which a single unit MTBF is changed, the 9020D/9020E system MTBF falls
into the interval from 998 to 1372 days.

Third, the rehost system MTBF is relatively insensitive to changes in
the MTBF's of the channels, TCU's, and SCU's. The explanation is that these
units have such high MTBF's that they almost never cause a system failure;
this remains true even after the unit MTBF's have been halved or doubled.

Fourth, the rehost system MTBF is very sensitive to the unit MTBF's of
the CPU and the memory, especially the latter. For example, if the memory
MTBF is halved from 2,293 to 1,147 hours, the rehost system MTBF falls to
1110 days, which is less than the 9020D/9020E system MTBF of 1226 days under
the baseline assumptions.

TABLE 2-4: REHOST SYSTEM AVAILABILITY AND MTBF: SENSITIVITY ANALYSIS

                    MTTR = 1/2           MTTR = 1             MTTR = 2
Component
MTBF's            Avail.      MTBF     Avail.      MTBF     Avail.      MTBF

Baseline        0.99999821    5807   0.99999283    2905   0.99997136    1455
High CPU        0.99999863    7599   0.99999452    3801   0.99997809    1902
Low CPU         0.99999716    3665   0.99998865    1835   0.99995470     920
High Memory     0.99999909   11536   0.99999639    5766   0.99998554    2880
Low Memory      0.99999529    2214   0.99998123    1110   0.99992533     558
High Channel    0.99999821    5813   0.99999284    2911   0.99997146    1460
Low Channel     0.99999819    5787   0.99999278    2885   0.99997095    1434
High TCU        0.99999839    6484   0.99999358    3244   0.99997434    1624
Low TCU         0.99999746    4098   0.99998984    2051   0.99995947    1028
High SCU        0.99999821    5819   0.99999284    2911   0.99997141    1458
Low SCU         0.99999819    5760   0.99999277    2882   0.99997112    1443
High All        0.99999955   23222   0.99999821   11615   0.99999283    5811
Low All         0.99999283    1454   0.99997137     728   0.99988577     365

N.B. MTBF is in days.


Fifth, to summarize, if the same MTTR is assumed for both systems, then
the rehost system's higher MTBF in the baseline case is maintained
throughout most of the cases examined. The conclusion is that the rehost
system's higher MTBF is not highly sensitive to the assumptions used here.

It should be pointed out, however, that the rehost system's lead in system
MTBF can be reduced and even lost if one picks and chooses from among the
cases so that the rehost system is put in the worst light and the
9020D/9020E system in the best light.

2.3 Hardware Reliability: The No Repairs Case

One  of  the  uncertainties  in  the  analysis  of  the  previous  section  is 
doubt  over  what  the  repair  time  would  be.  The  sensitivity  analysis  that 
assumed  various  mean  repair  times  is  one  way  of  dealing  with  this  doubt. 
Another  way  of  dealing  with  it  is  to  assume  that  repairs  are  not  made;  that 
is,  the  system  runs  until  enough  unit  failures  have  accumulated  to  cause  a 
system  failure.  This  approach  is  taken  in  this  section.  This  approach  is 
flawed  because  the  assumption  that  repairs  are  not  made  is  incorrect,  but  it 
does  give  one  a  way  of  comparing  the  different  systems  on  a  common  basis 
which  is  uncontaminated  by  a  possibly  inaccurate  assumption  about  repair 
time. 


The reliability of a system R(t) is defined to be the probability that
after t hours of operation there has not been a system failure. App. B
derives equations that allow the reliability function R(t) to be computed
once the MTBF's of the individual units are known. The main assumption is
that the failure time for each unit is exponentially distributed. The
reliability function, once it is obtained, can then be used to estimate the
system MTBF, i.e., the number of hours the system is expected to operate
before a system failure halts operation. (The system MTBF thus estimated is
not exact; it is approximated by a procedure explained in App. B.)
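
The idea can be sketched for a toy case. With exponentially distributed failure times, a unit with a given MTBF survives t hours with probability exp(-t / MTBF); series and parallel groupings combine these probabilities, and the expected life without repairs is the area under R(t). The sketch below uses a hypothetical pair of 1000-hour units and plain trapezoidal integration; both are assumptions made for illustration and stand in for, rather than reproduce, the equations and approximation procedure of App. B.

```python
import math


def unit_reliability(t, mtbf):
    """Probability that an exponential unit survives t hours."""
    return math.exp(-t / mtbf)


def series_reliability(t, mtbfs):
    """All units must survive."""
    return math.prod(unit_reliability(t, m) for m in mtbfs)


def parallel_reliability(t, mtbfs):
    """At least one unit must survive."""
    return 1.0 - math.prod(1.0 - unit_reliability(t, m) for m in mtbfs)


def mean_life(reliability, horizon, steps=100_000):
    """Trapezoidal estimate of the integral of R(t) from 0 to horizon."""
    dt = horizon / steps
    total = 0.5 * (reliability(0.0) + reliability(horizon))
    total += sum(reliability(i * dt) for i in range(1, steps))
    return total * dt


HYPOTHETICAL_MTBF = 1000.0  # illustrative value, not taken from Table 2-1
pair = [HYPOTHETICAL_MTBF, HYPOTHETICAL_MTBF]

series_life = mean_life(lambda t: series_reliability(t, pair), 30_000.0)
parallel_life = mean_life(lambda t: parallel_reliability(t, pair), 30_000.0)
print(f"series: {series_life:.1f} h, parallel: {parallel_life:.1f} h")
```

Without repairs, duplexing two identical units raises the expected life from half the unit MTBF (series) to one and a half times the unit MTBF (parallel), a much smaller gain than in the case with repairs.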

Using the best estimates of the unit MTBF's in Table 2-1, it is found
that the rehost system has an MTBF of 400 hours, and the 9020D/9020E system
has an MTBF of 300 hours. Therefore, in this analysis the rehost system
retains its lead in reliability.

A sensitivity analysis is also carried out for this no repair approach,
and the results are shown in Tables 2-5 and 2-6. The tables show the system
MTBF in hours for a variety of cases. The baseline case uses the best
estimates of Table 2-1. Each succeeding case indicates the individual
component for which the MTBF is varied, and high or low tells which of the
alternate MTBF's from Table 2-1 is used. For example, in Table 2-5 the
"High CE" case means that the high CE MTBF of 4116 hours from Table 2-1 is
used; all other MTBF's are the baseline MTBF's. That is, except for the
cases marked "All" at the bottom of the tables, only one unit's MTBF is
changed for each calculation. For the cases marked "All," the MTBF's of
every component are changed for the calculation. Examination of these
tables shows that, for the most part, the rehost system maintains its edge
in reliability; it is, however, possible to pick cases in which the 9020D/E
system has a higher system MTBF.

2.4  Software  Reliability 

The  reliability  of  software  is  an  important  aspect  of  any  computer 
system  since  the  hardware  and  software  are  combined  in  a  serial  manner  to 
support  every  application.  That  is,  failure  in  either  the  hardware  or  the 
software  will  result  in  a  system  failure.  This  means  that  perfect  hardware 
alone  cannot  overcome  software  defects  and  conversely.  Before  proceeding  to 
a  quantitative  analysis  in  Sec.  2.5,  this  section  will  give  a  qualitative 
discussion  of  what  the  reliability  of  the  rehost  system  software  is  expected 
to  be  compared  to  the  current  software. 

In the rehost system, there are three major software components:

• the NAS application software,
• the NAS monitor, and
• the virtual machine monitor (VMM).

TABLE 2-5: 9020D/9020E SYSTEM MTBF'S WITHOUT REPAIRS

Case            9020D/9020E MTBF (hours)

Baseline        300
High CE         310
Low CE          280
High SE         320
Low SE          270
High IOCE       310
Low IOCE        290
High TCU        300
Low TCU         300
High SCU        420
Low SCU         180
High All        560
Low All         160

TABLE 2-6: REHOST SYSTEM MTBF'S WITHOUT REPAIRS

Case            Rehost MTBF (hours)

Baseline        400
High CPU        440
Low CPU         350
High Memory     490
Low Memory      300
High Channel    490
Low Channel     280
High TCU        420
Low TCU         360
High SCU        400
Low SCU         400
Low All         200

The NAS application software should have reliability characteristics in
the rehost system that are equivalent to those for the current NAS
application software since this software will not be changed in the
rehosting process. Some additional application software failures are
expected during the testing phase as problems in the interfaces to the new
NAS monitor components and the VMM are identified. These interface problems
are expected to be resolved before operational use. It is important to note
that rehosting the NAS application software will preserve its current
reliability characteristics and cannot improve them. However, rehosting the
NAS software will allow changes in the usage of that software which would
result in improvements in its reliability. For example, the large memory of
the rehost system will allow all program elements and tables to be
memory-resident and will avoid problems with program element and table
buffering. This will improve reliability since swapping into and out of
main memory is currently a significant source of software failures.

The NAS monitor will be modified as part of the rehosting process to
accommodate changes in the hardware and the system configuration. These
changes will degrade the initial reliability characteristics of the NAS
monitor in the rehost system. After some period of operational usage, the
reliability characteristics of the modified NAS monitor can be expected to
return to the current level of reliability.

The VMM represents a new software component; unless the VMM were perfect
with respect to reliability, the VMM would result in some degradation of the
overall software reliability. The possible range of failures for a VMM is
indicated by the current commercial experience with a large virtual memory
operating system. That is, this system has 1 to 3 failures per month
[RUTL81, p. 2-11].

The net effect of rehosting on the overall reliability of the NAS
software is that the reliability is expected to be about equal to the
current reliability after some period of operational usage since the
benefits of memory-resident NAS application software will be largely offset
by defects in the VMM.

2.5 System Reliability

Sec. 2.2 produced quantitative estimates of the hardware availability
and MTBF for the rehost system and the 9020D/9020E system. Since, however,
system reliability depends both on hardware and on software, it is desirable
to extend the analysis to include not only hardware but also software. The
problem is that the information needed to include software in the analysis
is not available. Nevertheless, because of the interest in the reliability
of the total system and not just the hardware, some ballpark assumptions
about software will now be made so that quantitative estimates can be made
of system availability, system MTBF, and the expected duration of a system
outage. It must be stressed that these assumptions made about software do
not have a solid foundation; they are made here for illustrative purposes.
Six assumptions about software are used.

• The number of system failures caused by software is equal to the
  number caused by 9020D/9020E hardware (based loosely on FAA
  operational experience).

• When the NAS software fails, with probability 0.9 the failure is
  transient and the system outage during the dynamic recovery is
  exponentially distributed with a mean of 30 seconds. With
  probability 0.1 the system must be restarted, and the resulting
  system outage is exponentially distributed with a mean of 15 minutes.

• The VMM fails at the rate of twice per month.

• When the VMM in a mainframe fails, the system outage (while
  processing is transferred to the other mainframe) is exponentially
  distributed with a mean of 10 seconds. (This assumes that critical
  data are saved every five seconds and all software is loaded and
  ready to run in the back-up system.)

• When the VMM fails in a mainframe, the time that the mainframe is
  down while the VMM is restarted is 10 minutes.

• The NAS software has the same MTBF in the rehost system as in the
  9020D/9020E system.
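
To show what the software assumptions imply on their own, the sketch below computes two simple consequences: the expected outage per NAS software failure (a probability-weighted mean of the transient and restart cases) and the expected time per month that a mainframe spends restarting its VMM. The constant and function names are illustrative; combining these figures with the hardware model is done in App. A and is not attempted here.

```python
# NAS software failure assumptions from the list above.
P_TRANSIENT = 0.9
MEAN_TRANSIENT_SECONDS = 30.0
P_RESTART = 0.1
MEAN_RESTART_SECONDS = 15.0 * 60.0

# VMM assumptions from the list above.
VMM_FAILURES_PER_MONTH = 2.0
VMM_RESTART_MINUTES = 10.0


def expected_nas_outage_seconds():
    """Mean system outage per NAS software failure (mixture of two cases)."""
    return (P_TRANSIENT * MEAN_TRANSIENT_SECONDS
            + P_RESTART * MEAN_RESTART_SECONDS)


def vmm_restart_minutes_per_month():
    """Expected monthly time one mainframe is down restarting its VMM."""
    return VMM_FAILURES_PER_MONTH * VMM_RESTART_MINUTES


print(expected_nas_outage_seconds() / 60.0)  # minutes per NAS failure
print(vmm_restart_minutes_per_month())       # minutes per month
```

Although only one NAS software failure in ten requires a restart, the restart case contributes most of the expected outage per failure (90 of the 117 seconds).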

The assumptions made about hardware are that the unit MTBF's are the best
estimates in Table 2-1 and that the repair times are independent and
exponentially distributed with a mean of 1 hour.

This treatment assumes that the NAS software is combined serially with
the hardware; a failure in either causes a system failure. The VMM,
however, is treated like a component in a mainframe, e.g., just like the
CPU. When the VMM fails, processing is transferred to the other mainframe,
and the first mainframe is in failure mode until the VMM is restarted.

The availability, MTBF, and expected duration of a system failure that
are implied for each system are calculated in App. A and are shown in Table
2-7. The system MTBF for the rehost system of 1420 days is more than twice
that of the 9020D/9020E system of 613 days. While this is admittedly a
rough calculation, it is sufficient to refute the claim that the improved
reliability of the rehost hardware would be cancelled out by the unimproved
NAS software.

TABLE 2-7: AVAILABILITY, MTBF, AND EXPECTED DURATION OF A SYSTEM OUTAGE

                                              Expected Duration of a
System         Availability    MTBF (days)    System Outage (minutes)

9020D/9020E    0.99998191      613            16.0
Rehost         0.99998922      1420           22.0

Source: App. A

2.6 Non-standard System Failures

In addition to the hardware and software failures discussed so far,
there are also miscellaneous failures that are grouped together under the
heading of non-standard failures. These failures arise from:

• operator errors,

• technician repair errors, and

• exogenous events (e.g., earthquakes).


It seems likely that exogenous events would have a similar effect both on
the current systems and on the rehost system.

Good data on operator and technician errors is apparently not available,
but discussions with FAA personnel indicate that human error accounts for a
large percentage of failures. Since the rehost system would be much more
reliable, we would expect a significant reduction in these failures. That
is, with fewer failures, there would be fewer problems requiring operator
intervention or repairs, and there would, therefore, be fewer opportunities
for these types of failures. It is not possible to go beyond this
qualitative statement because of lack of data and understanding of these
non-standard failures.

2.7 Summary

This chapter has shown that in terms of reliability the rehost system
has both advantages and disadvantages when compared to the current system.
The main advantage of the rehost system is that, because it uses duplex
processors and modern technology, it has significantly greater hardware
reliability. While the analysis has not been verified empirically, the
increased reliability of the rehost system is so pronounced that it seems
unlikely that this result could be reversed by any changes to the analysis.
The main reliability disadvantage of rehosting is that it would require an
additional software component, the virtual machine monitor, and this would
present a continuing software reliability problem. Other software
reliability problems would result because of the changes to the NAS monitor,
but it is expected that these problems would decline in importance after an
initial shakedown period. In short, the rehost system would show an
increase in hardware reliability and be nearly equal in software
reliability.

The analysis of Sec. 2.5 suggests (but does not prove) that the rehost
system will offer a significantly greater reliability even after it is
recognized that the NAS code will still be used and that, in addition, a
virtual machine monitor will be used and will be a source of failures. It
should be emphasized that this rough analysis does not provide any
definitive answers, but it does provide a systematic way of thinking about
the question of how rehosting would affect system reliability.

3. PERFORMANCE

3.1 Purpose and Organization of this Chapter

The purpose of this chapter is to estimate the response time that the
rehost system could provide both in an absolute sense and also compared to
the 9020's. In order to estimate the response time two separate analyses
are carried out.

The first analysis, which is described in Sec. 3.2, treats the entire
system as a single, undifferentiated server. The extent to which
technological progress has increased hardware speed and capacity is
discussed, and a decrease in service time provided by the system is
determined. From this the system response time is estimated. This rough
analysis shows that rehosting does make sense from a performance point of
view; a more detailed analysis, therefore, is justified.

The second analysis uses the technique of operational analysis to look
at the response time of each individual component as a function of its
service time and utilization. The response times of individual components
are then added to obtain the system response time. Sec. 3.3 explains the
principles of this technique, and Sec. 3.4 applies it to estimate response
times.

Throughout this chapter when the data is ambiguous or unsatisfactory,
conservative assumptions are used. Therefore, if anything, the response
time of the rehost system would be better than the conservative estimates
made here.

3.2  A  Global  Performance  Analysis 

3.2.1  Introduction 

The 9020 systems are modified versions of the IBM 360 series of
computers, which was designed in the early sixties and introduced into the
market in the mid-sixties. The 9020 systems, therefore, generally use
hardware technology and programming methodologies that were developed in the
1960-1965 timeframe. During the intervening two decades, technology has
progressed rapidly, and this trend will continue in the foreseeable future.
For instance, semiconductor chip technology has advanced by five orders of
magnitude (10^5) in the 1965-1980 timeframe.

If a decision to rehost the NAS software is made in the near future, it
would take several months to call for bids, evaluate them, and make a
final choice among the alternative systems. In our analysis, we therefore
include systems that would become available during 1981 and 1982.

In  the  succeeding  paragraphs,  we  consider  various  system  elements 
separately. 

3.2.2 Central Processing Units

The processing capacity of the 9020A CE (7201-1) and the 9020D or 9020E
CE (7201-2) have been identified [NHAl, p. 3-2] as 286 KOPS (kilo-operations
per second) and 1,000 KOPS, respectively. These values have been used for
the performance calculations and, while they are different from the KOPS
values [I.IAS80, p. 104] used elsewhere in this report, these differences
will not significantly affect the results of the performance analysis.

IBM models now on the market that are upward-compatible with System/360
span the spectrum from 2,300 KOPS through 22,200 KOPS. Thus the CPU speed
accelerator factor is 1.6 to 15.2 times the performance of a 9020A system,
which consists of 3 9020A processors and 2 IOCE processors; it is 0.9 to 8.6
times that of a 9020D system, which consists of 2 9020D processors and 2
IOCE processors. Taking an average, the speed increase factor can be up to
12 times the performance of a 9020 system.

3.2.3 Memory Units

The 9020A and the 9020D systems are today equipped with on-line memory
units aggregating 2.25 megabytes and 2.50 megabytes, respectively. The
upper limit on currently available systems is generally either 16 megabytes
or 32 megabytes, with newer systems of up to 64 megabytes expected in the
next one to two years. Thus, the memory capacity of either a 9020A or a 9020D
can be enhanced by up to 25 times by using newer memory hardware. On the
speed front, however, the performance improvement is not so dramatic. The
9020A storage element has a cycle time of 2.5 microseconds for an access
width of 4 bytes, and the 9020D storage element has a cycle time of 0.8
microseconds for 8 bytes. Today's systems exhibit memory cycle times of
0.3-0.4 microseconds per 8 bytes, and thus the speed increase factor is
between 2 and 8. However, almost all new systems offer a fast cache memory,
whereas the 9020 did not implement a cache; these caches have capacities of
up to 64 kilobytes, and their cycle time is between 50 and 100 nanoseconds.
As the cache-hit ratio nears 100%, the effective memory cycle time becomes
equal to the cache cycle time. This would represent a speed-up factor of up
to 15 over a 9020D memory system. On the whole, and preferring to err on
the conservative side, we expect the memory speed increase to be between 2
and 10.
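
The dependence of the effective memory cycle time on the cache-hit ratio can
be illustrated with a short calculation. The sketch below is only an
illustration under a simple assumption that is not stated in the report: a
hit costs one cache cycle, a miss costs one main-memory cycle, and the two
never overlap. The function name and the sample hit ratios are hypothetical;
the cycle times are the ones quoted above.

```python
def effective_cycle_ns(hit_ratio, cache_ns, main_ns):
    """Average cycle time of a two-level memory (hits and misses assumed not to overlap)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ns + (1.0 - hit_ratio) * main_ns


CACHE_NS = 50.0        # fastest cache cycle time quoted in the text
MAIN_NS = 300.0        # fastest current main-memory cycle time (0.3 microseconds)
OLD_9020D_NS = 800.0   # 9020D storage element cycle time (0.8 microseconds)

for hit_ratio in (0.80, 0.90, 0.95, 1.00):   # illustrative hit ratios only
    t = effective_cycle_ns(hit_ratio, CACHE_NS, MAIN_NS)
    print(f"hit ratio {hit_ratio:.2f}: {t:6.1f} ns, "
          f"speed-up over 9020D = {OLD_9020D_NS / t:4.1f}x")
```

Under this simple model the speed-up falls off quickly as the hit ratio
drops below 100%, which is one reason for preferring the conservative range
of 2 to 10.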


3.2.4 Disk Units

The existing IBM 2314 disk units have a maximum capacity of 30 megabytes
and a maximum data transfer rate of 625 kilobytes a second. The disk
service time of 34 milliseconds is composed primarily of the access time and
the seek time (both mechanical functions) and, by comparison, a small data
transfer time, especially for small data blocks. For example, a 2K data
block has a total service time of 34 milliseconds, of which 3 milliseconds
is transfer time. Today, 300 megabyte disks are common, and 1000 megabyte
disks have recently become commercially available. Thus the capacity
increase factor for disks is between 10 and 30. The disk transfer rates
have improved from 625 kilobytes/second to about 2000 kilobytes/second, a
factor of 3. The improvement factor for access times and seek times is
between 1 and 2 only. However, since the primary memory would be much
larger than that of the existing system, a substantial number of programs,
and possibly some flight plan data, could under rehosting reside in main
memory, thus greatly reducing the number of disk accesses. With this
revised system design, we estimate that the overall speed increase in disk
service time would be at least between 2 and 3.

3.2.5 Tape Units

The technology trend for tapes is similar to that for disks. The
capacity increase factor will be between 5 and 10, and the overall speed
increase factor between 2 and 3. Since the disk units will now have much
higher capacities, it is possible to store archival data on disks rather
than on tapes as is presently done. This will improve the speed increase
factor.


3.2.6 Other CCC Equipment

Aside from the devices considered in the foregoing paragraphs, the CCC
consists of input devices, peripheral adapter modules (PAM's), and
channels. Computer systems today offer 4-16 channels per CPU, each with a
capacity of 2-10 megabytes per second; besides, the facility of
block-multiplexor mode, in addition to the traditional selector and
multiplexor modes, will mitigate any channel bottlenecking. The 9020A, as
well as the 9020D, has two PAM units. The load on all these units is
still below their capacity limits, and taking into account the technological
improvements, it is unlikely that there would be any problem in this area
from the performance viewpoint.

3.2.7  Display  Equipment 

In the baseline rehost configuration, the display channels (Raytheon 730
and 9020E) would be replaced while the display generators and controller
suites would be retained. The processing capacity necessary to support the
display channel functions is very low: current estimates of the
display processor utilization range from 1% to 12%. The effects of the
display channel workload on the rehost system will be minimal.

3.2.8  Computational  Workloads 

A report prepared by the Transportation Systems Center [CLA979]
indicates that the air traffic volume has been increasing at the rate of
4.4% per annum, roughly a doubling every 15 years. This implies that if no
system enhancements are made during the intervening period, the computational
workload in 1990 would be roughly twice that which existed in 1975, and the
1995 workload would be twice the 1980 workload. The rehost system design
could be modified to benefit from the larger capacities of primary memory,
disks and tapes; this would enable some near-term enhancements in NAS system
capabilities to be introduced without degradation of system performance. As
such, it would still be appropriate to assume a doubling of workload every
15 years as a ballpark figure (assuming no change in level of automated ATC
services or demand).
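
As a quick check on the ballpark figure, the compound growth implied by a
4.4% annual rate can be computed directly. This is a minimal sketch assuming
simple annual compounding; the function names are illustrative and do not
come from the cited report.

```python
import math


def growth_factor(annual_rate, years):
    """Workload multiplier after compounding annual_rate for the given number of years."""
    return (1.0 + annual_rate) ** years


def doubling_time(annual_rate):
    """Number of years needed for the workload to double at annual_rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


RATE = 0.044  # annual traffic growth quoted in the text

print(f"growth over 15 years: {growth_factor(RATE, 15):.2f}x")
print(f"doubling time:        {doubling_time(RATE):.1f} years")
```

With annual compounding the 15-year factor comes out slightly under 2, so
"doubling every 15 years" is a mild overstatement of the load, which is on
the conservative side for capacity planning.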

3.2.9  Rough  Calculations 

Assuming that the interarrival times and the service times are exponentially
distributed, the response time of a given server, or device, can be
calculated using the formula:

     Response Time = (Service Time) / [1 - (Service Time) x (Arrival Rate)]     (1)


For  a  fixed  arrival  rate,  this  formula  shows  that  if  the  service  time 
doubles,  the  response  time  will  more  than  double;  likewise,  if  the  service 
time  falls  by  50%,  the  reduction  in  response  time  would  exceed  50%. 
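
The nonlinear effect of the service time on the response time can be seen
numerically. The sketch below evaluates equation (1) for a fixed arrival
rate; the arrival rate of 10 requests per second and the service times are
illustrative values chosen for the example, not figures from this report.

```python
def response_time(service_time, arrival_rate):
    """Equation (1): response time of a single server with exponential assumptions."""
    utilization = service_time * arrival_rate
    if utilization >= 1.0:
        raise ValueError("server is saturated (utilization >= 1)")
    return service_time / (1.0 - utilization)


ARRIVAL_RATE = 10.0  # requests per second (illustrative)

base = response_time(0.020, ARRIVAL_RATE)     # 20 ms service time
doubled = response_time(0.040, ARRIVAL_RATE)  # service time doubled
halved = response_time(0.010, ARRIVAL_RATE)   # service time halved

print(f"doubling the service time multiplies the response time by {doubled / base:.2f}")
print(f"halving the service time multiplies the response time by {halved / base:.2f}")
```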


The  analysis  of  subsections  3.2.2  through  3.2.7  is  summarized  in  Table 
3-1.  This  table  shows  that  the  speed  improvement  factor  is  between  2  and  50 
depending  on  the  nature  of  the  device,  or  that  the  individual  service  times 
will  reduce  by  somewhere  between  50%  and  98%  of  the  respective  existing 
times.  In  the  most  conservative  case,  assume  that  the  reduction  is  50%  for 
all  devices,  electronic,  mechanical,  or  whatever. 


Over a 15 year timeframe, the arrival rate of transactions is expected
to double; with the service times halved, the product of service time and
arrival rate (defined as utilization) will remain constant. Thus, in
equation (1), the denominator will remain constant, and the numerator will
be halved, hence the time interval from a request for service to the
completion of the service (response time) will be reduced by 50%. That is,
the response time of the rehosted system in 1995 will be one-half the
response time of the existing system today. Since we have assumed very
conservative technology factors throughout, and have deliberately preferred
to err on the safe side while calculating acceleration factors, it can be
concluded that rehosting would maintain an acceptable response time under
the expected air traffic levels.


TABLE 3-1: TECHNOLOGY TRENDS

Device      Speed Increase Factor*      Capacity Increase Factor**

CPU         Up to 12                    Same instruction set
Memory      2-10                        Up to 25
Disk        2-3                         10-30
Tape        2-3                         5-10

*  Compared to devices used in the 9020 system
** Compared to maximum capacity of a 9020A or 9020D system

The  above  analysis  also  indicates  that  it  is  not  really  necessary  to  use 
the  most  advanced  and  most  capable  current  technology  computer  system  to 
achieve  rehosting.  We  now  focus  on  a  typical-sized  computer  that  should  be 
adequate  for  rehosting  the  NAS  software. 

3.3  An  Operational  Analysis  of  Performance:  Principles 

3.3.1  Overview 

This subsection provides a scenario for the analysis of the performance
of the rehost system in terms of its resource utilizations and response
times. The prime source of the data used is a collection of relevant
reports on the performance analysis of the existing systems [KAND77],
[NIEL77a], [WHAI81]. A detailed analysis requires more data on the workload
and the future system characterization than are available at this time, due
to a different orientation of the performance recording of the existing
systems and the lack of a benchmark on the future rehost system. An attempt
is made to use available data to project the performance of the rehost
system through a set of qualitative analyses. Moreover, as pointed out in
Sec. 3.2, the utilization of the present display channels (CDC or DCC) is
relatively low, and as such its contribution to the total computational
workload is minimal; the impact of transferring these functions to the
rehosted system would also be marginal. Therefore, this performance
analysis will concentrate on the workload of the CCC.

Throughout this section the operational analysis technique [DENN78] is
employed. The simplest form of this technique embodies the following
equation:

     R = S/(1-u),

where R is the response time of a certain type of workload, S is the service
time, and u is the utilization of the resource in question. The resource
may be an active server (e.g., CPU, channel, devices) or a passive server
(e.g., data base, program elements). The operational analysis technique
relaxes the restrictions on the distributions of the arrival rates and the
service times. The only assumption used is that the flow of transactions
through the system is balanced, which is satisfied in the system being
evaluated.
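
In operational analysis the quantities S and u are obtained from directly
measurable counts rather than from assumed distributions. The sketch below
illustrates this, assuming the standard operational definitions (utilization
= busy time / observation period, service time = busy time / completions,
throughput = completions / observation period). The function name and the
sample measurements are hypothetical and are not taken from the cited
studies.

```python
from typing import NamedTuple


class Metrics(NamedTuple):
    utilization: float    # u = B / T
    service_time: float   # S = B / C
    throughput: float     # X = C / T
    response_time: float  # R = S / (1 - u)


def operational_metrics(busy_time, completions, period):
    """Derive u, S, X and R from measured busy time, completion count and period."""
    if completions <= 0 or period <= 0:
        raise ValueError("completions and period must be positive")
    utilization = busy_time / period
    if utilization >= 1.0:
        raise ValueError("resource is saturated (utilization >= 1)")
    service_time = busy_time / completions
    throughput = completions / period
    return Metrics(utilization, service_time, throughput,
                   service_time / (1.0 - utilization))


# Hypothetical measurement: a device busy for 1800 s of a 3600 s interval
# while completing 9000 requests.
m = operational_metrics(busy_time=1800.0, completions=9000, period=3600.0)
print(m)
```

Note that the utilization equals throughput times service time, which is the
same product that appears in the denominator of equation (1).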

The method used in evaluating the performance of the rehost system is
summarized below.

(1) Identify the characterizations of a typical rehost system.

(2) Identify a base scenario workload: the characterizations of
    typical transactions and their arrival rates under a specified air
    traffic load on the present 9020A system.

(3) Conduct an analysis to obtain the utilization rates and response
    times of the rehost system under the workload of the base
    scenario identified in (2) above. This analysis will be done by
    integrating the utilization rates and response times calculated
    for each resource type in the rehost system.

(4) Compare the numbers obtained in (3) with the performance figures
    of the present system to derive improvement ratios.

(5) Extrapolate the impact of the increased air loads on the arrival
    rates of the transactions and perform a sensitivity analysis of
    the rehost system performance.

During the process of analysis, several assumptions are made where data
are lacking. These are described as the analysis proceeds.

3.3.2 Characterization of a Typical Rehost System

The hardware configuration of the rehost system has been briefly
described in Sec. 1.3. The CPU speed of the rehost machine is assumed to be
5900 KOPS, as found in current generation mainframes such as the IBM
3033U or Amdahl V7 [LIAS80, p. 104]. However, due to the sensitivity of this
speed factor to the cache-hit ratio of the CPU when running the NAS
programs, we assume a 5% degradation of speed performance to arrive at an
effective speed of 5605 KOPS. The memory size of the rehost system will be
at least 8 megabytes, with a growth potential of up to 16 megabytes. As the
size of the present NAS software is estimated to be around 4.1 megabytes, it
is expected that in the rehost system all buffered program elements (PE's)
and buffered flight plan data are to be memory-resident, thus eliminating
the need for swapping. The choice of the memory size of the rehost system
should aim at elimination of swapping. Future increase in the size of the
NAS software due to functional enhancements or increase in the size of the
data bases should be taken into account in deciding the size of the memory
of the rehost system.

The channels in the rehost system will have the block multiplexing
capability and a much higher transfer rate (e.g., 2.6-6 megabytes/second).

The disk and tape units will also be replaced by modern-technology
counterparts.

The rehost system will run VM/370 to ease the environmental changes for
the NAS monitor and the application software. Current commercial experience
indicates that VM processing results in about 25% CPU overhead, which
reduces the effective speed of the rehost CPU to 4203 KOPS.
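
The two speed adjustments compound multiplicatively. The following sketch
reproduces the arithmetic; the function name is illustrative, and treating
the two penalties as independent multiplicative factors is the assumption
implied by the figures above.

```python
RAW_KOPS = 5900.0          # assumed raw speed of the rehost CPU
CACHE_DEGRADATION = 0.05   # allowance for a lower cache-hit ratio on NAS programs
VM_OVERHEAD = 0.25         # CPU overhead attributed to VM processing


def effective_kops(raw_kops, cache_degradation, vm_overhead):
    """Apply the cache and VM penalties as successive multiplicative factors."""
    return raw_kops * (1.0 - cache_degradation) * (1.0 - vm_overhead)


after_cache = RAW_KOPS * (1.0 - CACHE_DEGRADATION)
adjusted = effective_kops(RAW_KOPS, CACHE_DEGRADATION, VM_OVERHEAD)
print(f"after cache degradation: {after_cache:.0f} KOPS")
print(f"after VM overhead:       {adjusted:.2f} KOPS")
```

The second figure is truncated to 4203 KOPS in the text.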

Since software rehost minimizes modification to current NAS software,
non-reentrant PE's and queueing delays due to PE or database lockups will
still exist.

Other input and output devices such as non-radar keyboards and flight
strip printers will be retained. With the exception of flight strip
printers and POBP's, these devices are not considered highly utilized and
will not significantly contribute to the response times. Therefore our
analysis will concentrate on the CPU, channel, disk and program utilizations.

3.3.3  Characterization  of  a  Typical  Transaction 

A transaction is characterized by its resource service time and arrival
rate. Normally the transactions processed by a system are grouped into a
small number of classes; transactions within each class consume similar
amounts of resources and have other similar properties (e.g., priorities).
However, the present system recording in the 9020's does not provide
resource consumption on a per transaction basis. Therefore, for the purpose
of a preliminary performance analysis, we have aggregated all resource
utilizations and distributed them among all input transactions (including
radar and timer messages) to derive the resource consumption of a "typical"
transaction. To do so, we make use of the aggregate data provided by
[WHAI81].

CPU and Disk Time

[WHAI81] provides the following data for the Houston 9020A site:

     Transaction arrival rate = 12,960 per hour
     CPU utilization          = 73% (or 2.2 out of 3 CE's)
     Disk utilization         = 38% per disk
     Track count              = 110

From the above data, we derive the following:

     CPU time per transaction  = 611 ms
     Disk time per transaction = 211 ms

The CPU time per transaction is derived as follows:

     (1) Total CPU time = (3,600,000 ms/hour) x (number of CE's busy)

     (2) CPU time per trans. = Total CPU time/arrival rate,

and the disk time per transaction is derived as follows:

     (1) Total disk time = (3,600,000 ms/hour) x (disk utilization x 2)

     (2) Disk time per trans. = Total disk time/arrival rate.
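
The two derivations above have the same form and can be checked with a few
lines of code. This is a minimal sketch of the stated arithmetic; the
function name is illustrative, and the factor of 2 for the disks follows the
derivation above.

```python
MS_PER_HOUR = 3_600_000


def time_per_transaction_ms(busy_servers, arrivals_per_hour):
    """Busy server-time accumulated in one hour, spread over that hour's transactions."""
    return MS_PER_HOUR * busy_servers / arrivals_per_hour


ARRIVALS_PER_HOUR = 12_960   # Houston 9020A site
CES_BUSY = 2.2               # 73% utilization of 3 CE's
DISKS_BUSY = 0.38 * 2        # 38% utilization on each of 2 disks

cpu_ms = time_per_transaction_ms(CES_BUSY, ARRIVALS_PER_HOUR)
disk_ms = time_per_transaction_ms(DISKS_BUSY, ARRIVALS_PER_HOUR)
print(f"CPU time per transaction:  {cpu_ms:.0f} ms")
print(f"Disk time per transaction: {disk_ms:.0f} ms")
```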

3.4  An  Operational  Analysis  of  Performance:  Results 

3.4.1  Scenario  Analysis 

This subsection presents an analysis of the performance of the rehost
system when running the base scenario workload described in the previous
subsection.

CPU performance. A comparison between the CPU speeds of the present
systems and the CPU speed of the rehost system is made to derive the speed
ratios shown in Table 3-2.

The adjusted rehost CPU speed has been derived by taking into account
the best estimates for VM overhead (25%) and the cache-hit ratio degradation
(5%).


TABLE 3-2: PROCESSOR SPEED COMPARISONS

System                        9020A      Rehost     Adjusted Rehost

CPU speed (KOPS) per CE       286        5900       4203
Speed improvement ratio*      14.7       .71        1

* Normalized by the adjusted rehost speed.


Employing the basic operational analysis equation, R = S/(1-u), the
figures in Table 3-3 are obtained, which characterize the CPU response time
per transaction in the present and the rehost systems.


TABLE 3-3: CPU RESPONSE TIME PER TRANSACTION FOR THE 9020A AND THE REHOST
SYSTEM

System                        9020A      Rehost

CPU time (ms)/trans.          611        41.6
CPU utilization (per CE)      0.73       0.15
CPU response time (ms)        2263       49


Disk time. The present 2314 units have an access time comparable to
that of the 3330's, the replacement disks. However, the disk utilization
will be dramatically reduced in the rehost system due to the elimination of
disk accesses for buffered program elements and flight plan data bases. The
Logicon studies provide the information on the disk activities shown in
Table 3-4.


TABLE 3-4: INFORMATION ABOUT DISK ACTIVITY FOR THE 9020A AT MEMPHIS

                    Percentage of Total Disk Activities
Track Count         Buffered PE     Buffered Flight Plans     Total

124                 50.0            18.6                      68.6

Source: [NIEL77a]


Based on this observation, it is assumed that the disk utilization will
be reduced by 68.6% for the 9020A systems. These are translated into
comparable reductions in the disk times per transaction in these systems, and
the characterizations for the disk activities in the rehost system are shown
in Table 3-5.


TABLE 3-5: DISK RESPONSE TIME FOR THE 9020A AND THE REHOST SYSTEM

System                          9020A      9020A Rehost

Disk time (ms)/trans.           211        66.25
Disk utilization (per disk)     38%        12%
Disk response time (ms)         340        75


Channel time. The utilizations of the two selector channels in the
present system are directly related to the disk and tape activities. While
the channel time is not considered significant and therefore not fully
analyzed in the Wilson-Hill study, the following predictor equation was
given in the Logicon report [KAND77, p. 3-22]:

     Channel Utilization % = Disk u. x (25 ms/access time) + SAR% + REMON%

                           = Disk u. x (25 ms/access time) + (.0732 x Active)
                             + 6.35,

where Active is the active flight count and, from the same report, is
found to be approximately 1.5 times the track count. Also, it is estimated
by Logicon that the average disk access time is 38 ms [KAND77, p. 5-11].
Based on the above discussion and analysis in the previous paragraphs, the
figures shown in Table 3-6 are derived.
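
The predictor equation can be evaluated for the base scenario as follows.
This is a sketch under stated assumptions: the disk utilization is entered in
percent, the 120 ms channel time per transaction is taken from Table 3-6
rather than derived, and the function names are illustrative. The results
land close to, but not exactly on, the Table 3-6 values, presumably because
of rounding in the inputs.

```python
def channel_utilization_pct(disk_util_pct, access_ms, active_flights):
    """Logicon predictor: disk-driven term + SAR% (.0732 x Active) + REMON% (6.35)."""
    return disk_util_pct * (25.0 / access_ms) + 0.0732 * active_flights + 6.35


def wait_time_ms(service_ms, utilization):
    """Wait time = response time - service time, with R = S/(1-u)."""
    if utilization >= 1.0:
        raise ValueError("channel is saturated (utilization >= 1)")
    return service_ms / (1.0 - utilization) - service_ms


TRACK_COUNT = 110
ACTIVE = 1.5 * TRACK_COUNT   # active flight count, about 1.5 times the track count
DISK_UTIL_PCT = 38.0         # per-disk utilization at the Houston site
ACCESS_MS = 38.0             # average disk access time estimated by Logicon
CHANNEL_MS = 120.0           # channel time per transaction (Table 3-6)

util_pct = channel_utilization_pct(DISK_UTIL_PCT, ACCESS_MS, ACTIVE)
wait_ms = wait_time_ms(CHANNEL_MS, util_pct / 100.0)
print(f"channel utilization: {util_pct:.1f}%")
print(f"channel wait time:   {wait_ms:.1f} ms")
```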

In the rehost system the reduction of disk usage combined with the
introduction of the block multiplexor channels is expected to dramatically
reduce the channel utilization. Furthermore, the channels to be used in the
rehost system will have a transfer rate up to 2.6 megabytes per second,
approximately 6.5 times that of the present system. It is therefore
concluded that the channel wait time in the rehost system as a percentage of
the total response time will be negligible, and it is ignored in our
response time analysis.

Passive servers. By adding up the CPU and the disk response times and
the channel wait time presented above, one obtains the expected response
time per "typical" transaction without regard to output device I/O delays
and delays due to data base locks and non-reentrant PE locks. Because the
output devices are in general not to be replaced in the rehost system and
their service times are not included in the response time definitions as
specified in NAS-MD-318, they will not be considered in this performance
analysis. However, the PE and data base locks are potential contributors to
the response times in both the present systems and the rehost systems.

Judging by Wilson-Hill's experience in performance modeling, the PE lock
delay is expected to be substantial, while the data locks do not contribute
significantly to the overall response time. A pessimistic assumption is
made for the purpose of analysis that the PE's as a passive server for our
"typical" transaction have an upper bound of 60% utilization. This means
that this passive server has an average service time equivalent to the
aggregated average active server times per transaction, and is, under
current load, 60% utilized. The purpose of this worst case assumption is to
predict how the rehost system will perform under this adversity. That is,
the overall response times, taking into consideration the PE and data base
locks, are derived from the total active server times using the PE
utilization and are shown in Table 3-7.


TABLE 3-6: CHANNEL WAIT TIME FOR THE 9020A*

Channel time (ms)/trans.                120
Channel utilization (per channel)       43.3%
Channel wait time (ms)                  90.5

* "Wait time" is defined to be the response time minus service time.


TABLE 3-7: OVERALL RESPONSE TIMES

System                                9020A      9020A Rehost     Improvement Ratio

CPU response time/trans. (sec)        2.263      0.049            46.2
Disk response time/trans. (sec)       0.340      0.075            4.5
Channel wait time/trans. (sec)        0.091      -
Total active server time (sec)        2.694      0.124            27.2
PE utilization                        60%        2.8%             27.2
Response time/trans.* (sec)           6.735      0.128            52.6

* Including wait for PE locks


Note that this passive server's service time is proportional to the
total active server time, and therefore under the same air load will be very
sensitive to the technology used by the active servers.
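
The response-time rows of Table 3-7 can be reproduced from the active server
times and the 60% PE utilization assumption. The sketch below assumes, as
the note above implies, that the passive server's service time scales with
the total active server time while the arrival rate is unchanged; the
function names are illustrative.

```python
def scaled_pe_utilization(base_pe_util, base_active_s, new_active_s):
    """PE utilization when its service time scales with active server time (same arrival rate)."""
    return base_pe_util * new_active_s / base_active_s


def overall_response_s(active_s, pe_util):
    """Apply R = S/(1-u) with the PE's treated as a passive server."""
    if pe_util >= 1.0:
        raise ValueError("passive server is saturated (utilization >= 1)")
    return active_s / (1.0 - pe_util)


ACTIVE_9020A_S = 2.694    # total active server time, present system
ACTIVE_REHOST_S = 0.124   # total active server time, rehost system
PE_UTIL_9020A = 0.60      # pessimistic assumption for the present system

pe_util_rehost = scaled_pe_utilization(PE_UTIL_9020A, ACTIVE_9020A_S, ACTIVE_REHOST_S)
resp_9020a = overall_response_s(ACTIVE_9020A_S, PE_UTIL_9020A)
resp_rehost = overall_response_s(ACTIVE_REHOST_S, pe_util_rehost)

print(f"rehost PE utilization: {pe_util_rehost:.1%}")
print(f"9020A response time:   {resp_9020a:.3f} s")
print(f"rehost response time:  {resp_rehost:.3f} s")
print(f"improvement ratio:     {resp_9020a / resp_rehost:.1f}")
```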

3.4.2  Sensitivity  Analysis 

This subsection presents an analysis of performance of the rehost system
under varying air traffic loads. The purpose is to project the rehost
system performance into the 1990 timeframe based on the traffic load
predictions. It was evident from the Logicon and the Wilson-Hill studies
that the 9020A systems are already occasionally failing to provide adequate
services under today's air traffic load. Their performance in the 1990's is
not analyzed here.

The basic assumption underlying the present analysis is that the arrival
rate of "typical transactions" is largely proportional to the track count
handled by the center [PSBSOl, p. 3-1].

As all ARTCC's are required to handle the peak traffic load with
adequate performance, the projected peak track counts presented in [APO81]
are used as a basis for projection. Note that these numbers represent
traffic load in the busiest center in the country; the average centers will
be handling peak track counts much lower than these. Note also that in our
base scenario presented in the previous subsection, the track count was 110
while the 9020A CPU utilization was 73%. Comparing this data with that
reported in the 1977 Logicon report on the Memphis 9020A site, which cites
a $6% CPU utilization with a track count of 124, it seems that the CPU
workload per transaction in our base scenario is on the high side.

Therefore, the projected CPU utilization is expected to be on the high
side. The results are shown in Table 3-8. These calculations show that the
performance of the rehost system will remain satisfactory through the middle
of the 1990's. Note that as the load increases, the actual performance will
be increasingly more sensitive to the validity of the parameters and
assumptions used in the base scenario analysis. Since the base scenario
uses conservative estimates and assumptions, the actual performance of the
rehost system in the 1990's is likely to be far better than that presented.
However, since the CPU is the bottleneck in the analysis for Table 3-8,
another analysis based on the assumption that the rehost system would have a
10,000 KOPS CPU is presented in Table 3-9.

The CPU could be upgraded in many ways for the rehost system to
accommodate unexpected growth in air traffic or uncertainty in the
parameters for the analysis. For example, one candidate CPU for the rehost
system, the Amdahl 470/V7, can be field-upgraded to a model V8 and raise the
gross processing capacity of the rehost system from 5,950 KOPS to 6,375 KOPS
[LIAS80, p. 104].


TABLE 3-8: PERFORMANCE PROJECTION OF THE REHOST SYSTEM (1)

                                            ------------ Rehost System (2) ------------
                                   9020A    Base     1980      1985      1990      1995

Track count                        110      110      319 (3)   384 (3)   486 (3)   597 (3)
CPU utilization                    .73      .15      .44       .52       .66       .81
CPU response time (ms)             2,263    49       74        87        123       224
Disk utilization                   .38      .12      .35       .42       .53       .65
Disk response time (ms)            340      75       102       114       141       190
Total active server time (ms)      2,694    124      175       201       264       414
PE utilization                     .60      .028     .11       .16       .26       .51
Overall response time (ms)         6,735    128      197       239       357       845

(1) This prediction is conservative in that it is designed to be the worst
    case prediction for rehosting.

(2) A 5900 KOPS CPU is assumed, e.g., an IBM 3033U or an Amdahl V7.

(3) [APO81] These figures are the forecasts of the peak track count, which
    by definition is the largest track count sustained over a seven minute
    period. In every case the peak is at the Chicago ARTCC; peaks at most
    of the other ARTCC's are considerably smaller.
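
The projection in Table 3-8 can be approximated with a short script. This is
a sketch under assumptions inferred from the table rather than stated in the
text: utilizations scale linearly with the track count, each active server
follows R = S/(1-u), and the PE utilization scales with both the arrival rate
and the total active server time, starting from the 60% figure at 2,694 ms
for the 9020A. All names are illustrative. Because the inputs are rounded,
the output tracks the table closely at low load and drifts by a few percent
at the 1995 load, where the result is most sensitive.

```python
BASE_TRACKS = 110                    # track count in the base scenario
CPU_MS, CPU_UTIL = 41.6, 0.15        # rehost CPU time/transaction and base utilization
DISK_MS, DISK_UTIL = 66.25, 0.12     # rehost disk time/transaction and base utilization
REF_ACTIVE_MS, REF_PE_UTIL = 2694.0, 0.60   # 9020A reference point for the PE lock model


def project(track_count):
    """Project rehost response times (ms) for a given peak track count."""
    scale = track_count / BASE_TRACKS          # arrival rate assumed proportional to tracks
    cpu_u, disk_u = CPU_UTIL * scale, DISK_UTIL * scale
    if cpu_u >= 1.0 or disk_u >= 1.0:
        raise ValueError("an active server is saturated")
    cpu_r = CPU_MS / (1.0 - cpu_u)
    disk_r = DISK_MS / (1.0 - disk_u)
    active = cpu_r + disk_r                    # channel wait time is ignored, as in the text
    pe_u = REF_PE_UTIL * (active / REF_ACTIVE_MS) * scale
    if pe_u >= 1.0:
        raise ValueError("the passive server is saturated")
    return {"cpu_util": cpu_u, "cpu_ms": cpu_r, "disk_util": disk_u,
            "disk_ms": disk_r, "active_ms": active, "pe_util": pe_u,
            "overall_ms": active / (1.0 - pe_u)}


for label, tracks in (("Base", 110), ("1980", 319), ("1985", 384),
                      ("1990", 486), ("1995", 597)):
    r = project(tracks)
    print(f"{label:>4}: CPU u={r['cpu_util']:.2f}  disk u={r['disk_util']:.2f}  "
          f"PE u={r['pe_util']:.3f}  overall={r['overall_ms']:.0f} ms")
```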


TABLE 3-9: PERFORMANCE PREDICTION OF A REHOST SYSTEM
(WITH A 10,000 KOPS CPU) (1), (2)

                                   Base
                                   Scenario    1980      1985      1990      1995

Track count                        110         319 (3)   384 (3)   486 (3)   597 (3)
CPU utilization                    .09         .26       .31       .40       .48
CPU response time (ms)             27          33        36        41        48
Disk utilization                   .12         .35       .42       .53       .65
Disk response time (ms)            75          102       114       141       190
Total active server time (ms)      102         135       150       182       238
PE utilization                     .023        .088      .12       .18       .29
Overall response time (ms)         104         148       170       222       335

(1) This prediction is conservative in that it is designed to be the worst
    case prediction for rehosting.

(2) The IBM 3081 and Amdahl 5860 will provide at least 10,000 KOPS.

(3) [APO81]


4. TECHNICAL ISSUES


4.1 Introduction


Chapters 2 and 3 have examined the levels of reliability and performance
that could be achieved by rehosting. These chapters have implicitly assumed
that the current NAS software can indeed be made to run successfully on the
replacement machine. The purpose of this chapter is to identify the
problems that might keep the rehosted software from running and to indicate
how these problems might be dealt with. Sec. 4.2 discusses the special
instructions executed by the 9020 and Sec. 4.3 discusses special features of
the 9020 environment.

4.2  Special  Instructions 

4.2.1  Introduction 

Tha  9020  computers  are  capable  of  executing  all  of  the  standard  IBM 
Systsm/360  instructions  plus  several  special  instructions  [IBM73] .  These 
special  instructions  are  shown  in  Table  4-1  along  with  the  number  of  times 
each  occurs  in  the  NAS  CCC  code.  These  special  instructions  are  essential 
for  the  operation  of  the  NAS  software  in  a  multi-processor  and 
multi-processing  enviroment.  However*  as  indicated  in  Table  4-1*  their 
static  usage  is  quite  low  (less  than  0.1%  of  all  the  instructions  in  the  NAS 
software) .  In  addition*  the  usage  is  confined  to  about  10%  of  all  modules 
[FAAT811  and  most  of  these  modules  support  startup*  startover  and  diagnostic 
functions. 

4.2.2  NAS  Application  Software 

The  usage  of  9020  special  instructions  in  the  NAS  application  software 
has  been  investigated  at  the  FAA  Technical  Center  [FAAT81]  as  part  of  their 
effort  to  demonstrate  that  the  flight  data  processing  (FDP)  subsystem  of  the 
MAS  software  could  be  rehosted  on  an  IBM  4341  computer.  The  results  of  this 
investigation  are  that  only  three  special  instructions  are  directly  used  in 
the  NAS  application  software}  they  are: 
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TABU  4-1:  9020  SPECIAL  IMSTRDCTIONS  AND  THEIR  USAGE 


Inatr action 

Mnemonic 

Op  Code 

NAS  Dsaqe 

Coments 

Sat  configuration 

SCON 

01 

12 

Delay 

DLY 

OB 

62 

Load  identity 

LI 

OC 

16  (est) 

Set  address  translation 

SATR 

OD 

5 

Insert  address  translation 

lATR 

OE 

IS 

Conflict  with 

MVCL 

Load  data  address 

LDA 

99 

1 

lOCE  only 

Start  lOCEp 

Slop 

9A 

16 

Set  PCI 

SPCI 

9B 

0 

Store  PS  base  register 

SPSB 

AO 

S 

Load  PS  base  register 

LPSB 

A1 

18 

Hove  word 

NVW 

08 

428 

Convert  and  sort  syabol 

CSS 

02 

? 

9020E 

Convert  weatherline 

CVHL 

03 

? 

9020E 

Repack  synbol 

RPSB 

OF 

7 

9020E,  Conflict 

with  CLCL 

Load  chain 

LC 

52 

? 

9020E 

Source:  [IBM73]  and  [IBN75] 
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•  0€lay, 

•  Load  identity, 

•  Move  word. 

The  remaining  special  instructions  support  supervisor  needs  in  a  multi¬ 
processor  system  and  occur  only  in  the  MAS  monitor. 

The support for these special instructions can be provided in many ways
in an instruction-compatible computer; one approach based on the FDP
demonstration [FAAT81] is:

• Delay: Trap the operation code and suspend the program element (PE)
for the specified delay interval. Delay is used for synchronizing
modules (not needed in a uniprocessor environment) and for
accommodating communication circuit transients.

• Load identity: Trap the operation code and return a fixed value
since the rehosted software will execute in a uniprocessor
environment.

• Move word: Trap the operation code and perform the equivalent move
operation with move characters (MVC). Alternatively, all instances
of MVW could be replaced in the source code with equivalent MVC
instructions.
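
The trap-and-emulate approach listed above can be sketched in a few lines of
Python. This is only an illustration and not the implementation used in the
FDP demonstration: the operand layouts, the fixed identity value, and the
names Machine, TRAP_TABLE and the emulate_* handlers are all assumptions made
for the example. Only the three mnemonics and their operation codes are taken
from Table 4-1.

```python
from dataclasses import dataclass, field

WORD = 4  # bytes per word


@dataclass
class Machine:
    """Toy model of the rehosted uniprocessor (hypothetical, for illustration)."""
    memory: bytearray = field(default_factory=lambda: bytearray(64))
    registers: list = field(default_factory=lambda: [0] * 16)
    suspended_for: int = 0  # delay interval requested by the last DLY


FIXED_IDENTITY = 1  # assumed constant; the real value would be site-specific


def emulate_dly(m, interval, _unused):
    # Delay: suspend the program element for the requested interval.
    m.suspended_for = interval


def emulate_li(m, reg, _unused):
    # Load identity: a uniprocessor always reports the same identity.
    m.registers[reg] = FIXED_IDENTITY


def emulate_mvw(m, dst, src, words=1):
    # Move word: same effect as an MVC covering whole words.
    n = words * WORD
    m.memory[dst:dst + n] = m.memory[src:src + n]


# Operation codes from Table 4-1.
TRAP_TABLE = {0x0B: emulate_dly, 0x0C: emulate_li, 0x08: emulate_mvw}


def on_operation_exception(m, opcode, op1, op2):
    """Called when the hardware rejects an opcode it does not implement."""
    handler = TRAP_TABLE.get(opcode)
    if handler is None:
        raise ValueError(f"unsupported operation code {opcode:#04x}")
    handler(m, op1, op2)
```

A table-driven dispatcher of this kind keeps the per-instruction cost of a
trap small, which matters most for Move word because it is by far the most
frequently used special instruction in Table 4-1.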

4.2.3  NAS  Monitor 

The NAS monitor uses all of the special instructions except the display
instructions and set PCI (which is not used in the NAS software). An
instruction-compatible computer can support these special instructions in
many ways. It will be important to develop support for these special
instructions that is consistent with their frequency of use (static and
dynamic), their function in a uniprocessor environment, and their effect on
monitor performance. In addition to the special instruction support already
described for DLY, LI and MVW, the following is one approach based on the
FDP demonstration [FAAT81] for supporting the special instruction needs of
the NAS monitor:

• Set configuration: Trap the operation code and update a virtual
memory monitor table as necessary.

• Set address translation: Trap the operation code and modify the
virtual memory page tables as necessary.

• Insert address translation: Change the operation code to an unused
value to avoid the operation code conflict with move characters long
(MVCL). Trap the reassigned operation code and access the virtual
memory page tables as necessary.

• Load data address: The instruction need not be supported since this
instruction is unique to the IOCE's and the IOCE functions will be
replaced in the rehost system.

• Start IOCEp: The instruction need not be supported since the
baseline rehost configuration would not have IOCE's. The NAS monitor
would require modifications to achieve equivalent I/O control in the
baseline rehost configuration.

• Store PS base register: Trap the operation code and access the
virtual memory page tables as necessary.

• Load PS base register: Trap the operation code and update the
virtual memory page tables as necessary.

Four of the special instructions (DLY, LDA, LI and MVW) function
somewhat differently in the IOCE. These differences are not expected to be
a problem since the IOCE code would be replaced in the baseline rehost
configuration (see App. G).

4.2.4  Display  Channel 


In the rehost system, each mainframe will be capable of providing all of
the processing needs of a CCC and a display channel. The display channel
support in a rehost system would be based on the 9020E software since the
Raytheon 730 software cannot be rehosted on a 9020 instruction-compatible
computer. Note that a rehost system would replace both types of display
channels: 9020E and Raytheon 730.

Most of the 9020E display channel software would be reused in the rehost
system. The display device and configuration-dependent component of the
display software will be replaced in the rehost system since the display
interface would be the refresh buffers attached to the rehost system
channels. The remainder of the display software will be reused without
modification. Although no static instruction usage data are available for
the 9020E software, the usage of non-standard instructions is expected to be
limited to those used in the NAS application software and the four display
instructions. The display instructions can be supported in many ways with
an instruction-compatible computer. One approach would be to change the
operation code for RPSB to an unused value, avoiding the conflict with the
operation code for compare logical characters long (CLCL), and then trap the
operation codes for CSS, CVWL, RPSB and LC so that the equivalent functions
could be emulated. This emulation would require careful investigation since
the display instructions were originally implemented for performance needs.

The monitor currently used in the display channel based on the 9020E is
very similar to the NAS monitor used for the CCC. During the rehost
process, the two monitors would be merged so that only one version of the
monitor would be supported (and maintained) and monitor code could be shared
between the virtual processes in the rehost system.

4.2.5  Summary 

The effects of rehosting the NAS application software have been shown to
be minimal. That is, radar input processing would be revised and the
remaining NAS application subsystems would not be changed as long as the VM
monitor is augmented to support the 9020 special instructions. The display
channel software would require modifications to accommodate the change from
display element memory to the display buffers. The remainder of the display
channel software is expected to be reusable with the VM monitor providing
the equivalent function that the 9020 special instructions provide. The NAS
monitor would require some modification as part of the rehosting process.
That is, those parts of the monitor that support local devices, error
analysis, reconfiguration, and startup would require revision. The
remainder of the NAS monitor would be reusable as long as the VM monitor
supports the 9020 special instructions.

Engineering estimates for the impact of rehosting the NAS software have
been prepared for cost estimating purposes and are listed in the cost
chapter (Table 5-1). These derived software costs include the costs to
configure and augment the VM monitor.

4.3  9020  Environment 


4.3.1  Introduction 

The 9020 hardware and NAS monitor provide an operating environment for
the NAS application software. Part of this environment is provided by the
instruction-compatible computer and the support for the special
instructions. The remainder of this environment would be provided by a
combination of modifications to the NAS monitor and the services provided by
a virtual machine monitor. The environment problem areas are:

• Memory usage,

• Timer usage and synchronization,

• Program status word (PSW) format,

• Devices and channel program usage,

• Diagnose and error analysis.

4.3.2  Memory  Usage

There are several issues related to memory usage in the 9020 system:
page zero, storage keys, immediate instructions and memory size. Page zero
(4096 bytes starting at byte 0) has many permanent storage assignments for
system functions such as initial program loading, interrupt processing, I/O
initiating, interval timer processing, and diagnostic logging. In a
multiprocessor computer system, it is essential that this page for each
active processor be relocated to a unique memory location. While the
relocation is not essential for a uniprocessor environment, the NAS monitor
assumes that the page zero will be relocated and refers to that relocated
page in an absolute manner. One approach for resolving this problem is for
the virtual memory monitor to use the parameters of SPSB and LPSB
instructions to modify the page tables for the virtual memory associated
with the NAS monitor.

Storage keys represent one mechanism for protecting memory in
multiprogramming environments and they have been used to support the NAS
software. Since some storage (Compool tables) is shared by several
subsystems, the protection mechanism must allow access to shared storage.
In the 9020 system, one storage domain was made accessible by all other
storage domains. This sharing feature is not supported in any
instruction-compatible computer. The problem could be resolved by
micromanaging the storage keys in the virtual machine monitor. If
operational experience showed that this handling of storage keys caused a
performance problem, then the rehost computer hardware could be modified
(at additional cost) to support the NAS usage of storage keys.

Three of the immediate instructions (and immediate, or immediate, and
exclusive or immediate) operate with a fetch, modify, and store sequence
that could cause undefined results in a multiprocessor configuration if one
processor were to store a value into a location after another processor had
fetched a value from that location but before it had stored the result. In a
multiprocessor environment, it is essential that these instructions execute
in an atomic manner. This atomic execution would not be necessary for NAS
application software in a uniprocessor environment since there would not be
a competing processor. Interrupt processing in either multiprocessing or
uniprocessing environments could interfere with the execution of immediate
instructions. However, this interference would represent a software logic
defect and not a rehosting problem.

The baseline rehost configuration will provide sufficient physical
memory so that the NAS application software, the NAS monitor, and the
virtual machine monitor can remain memory-resident at all times. Hence,
disk buffering of PE's and flight plan tables as well as the CE overhead for
the disk buffering will be eliminated and should result in a reduction of
the average response time for ATC services that were supported by buffered
PE's and tables. The potential improvement in average response time may be
limited by the internal queues for non-reentrant PE's (see 3.3.3).
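
The hazard described earlier in this section for the immediate instructions
can be made concrete with a small deterministic simulation. The sketch below
is illustrative only: it models an OR immediate as three separate steps
(fetch, modify, store) and replays two hypothetical schedules for two
processors. The step names, the masks, and the run helper are assumptions
made for the example, not a description of 9020 hardware.

```python
def run(schedule, masks, initial=0x00):
    """Replay a schedule of (cpu, step) pairs against one shared byte.

    Each CPU performs an OR immediate as fetch -> modify -> store.
    masks[cpu] is the immediate operand used by that CPU.
    """
    shared = initial
    scratch = {}  # per-CPU private copy held between fetch and store
    for cpu, step in schedule:
        if step == "fetch":
            scratch[cpu] = shared
        elif step == "modify":
            scratch[cpu] |= masks[cpu]
        elif step == "store":
            shared = scratch[cpu]
        else:
            raise ValueError(step)
    return shared


masks = {"A": 0x01, "B": 0x80}
steps = ("fetch", "modify", "store")

# Atomic execution: each instruction completes before the next begins.
atomic = [("A", s) for s in steps] + [("B", s) for s in steps]

# Interleaved execution: B stores between A's fetch and A's store.
interleaved = [("A", "fetch"), ("B", "fetch"), ("B", "modify"),
               ("B", "store"), ("A", "modify"), ("A", "store")]

print(hex(run(atomic, masks)))       # both bits are set
print(hex(run(interleaved, masks)))  # B's update is overwritten by A
```

In the interleaved schedule the bit set by processor B is lost, which is the
undefined result the text warns about. With a single processor and no
interrupt between the three steps, only the first kind of schedule can occur.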

4.3.3  Timer  Usage  and  Synchronization

There are several timing considerations that must be analyzed as part of
the change in NAS software environment. The NAS software needs both an
interval timer and a time-of-day (TOD) timer which are currently provided by
a 60 hertz decrementer at location 80 in page zero and a coded time source
(CTS), respectively. Both of these timer needs should be supported by
modifying the NAS monitor to directly access the timer support provided by
the virtual machine monitor. Simulating the current timers with indirect
access to the virtual machine monitor timer would result in accuracy
problems.

The differences in processor and memory cycle times for the baseline
rehost configuration as compared to the current 9020 processor and memory
cycle times may result in some synchronization problems which have not been
detected in the operation of the NAS software in the 9020A and 9020D
configurations. That these differences would turn out to be a problem,
however, seems remote; the rehosting contractor, nevertheless, should be
aware of the possibility that a problem exists.

Another synchronization issue involves the use of write direct for
interprocessor communication in a multiprocessor configuration. In the
uniprocessor environment of the baseline rehost configuration, the write
direct may be resolved by doing nothing since the current NAS monitor is
capable of operating with one processor. Alternatively, the usage of write
direct in the NAS monitor could be reviewed and the software revised as
necessary. The companion instruction, read direct, is not an issue since it
is not used in the NAS software.

4.3.4  PSW  Format


The PSW format for the 9020 system is very similar to that for the
standard IBM 360 computers with some unused bits in the interrupt code field
assigned so that the additional 9020 channels could be addressed. Since the
channel address problem has been resolved differently for IBM 370 computers
and their equivalents, a PSW format difference would exist between the 9020
system and any replacement computer. This difference can be resolved in two
ways:

• Allow the virtual memory monitor to translate the PSW format.

• Modify the NAS monitor so that all references to the PSW use the
standard format. The changes would affect all uses of load PSW
(LPSW) and set system mask (SSM) as well as part of the support for
the supervisor calls (SVC).

4.3.5  Devices  and  Channel  Program  Usage

Since all of the local devices would be replaced as part of the baseline
rehost configuration, all of the device support routines and procedures for
accessing these devices will either be revised or replaced. The virtual
machine monitor should provide all the necessary device support routines
with minimal need for modifications. The channel programs in the I/O
management subsystem and the I/O device-dependent code subsystem would
require revision to accommodate the new devices. In the event that channel
programs have been embedded in other parts of the NAS software, then these
channel programs would have to be located and revised (or the adjacent code
modified to use the standard I/O subsystems). In addition, those subsystems
that are dependent upon device characteristics (for example, disk tracks and
cylinders or tape density) would require modification.

The radar input is currently supported within an IOCE using an open loop
channel program. In the rehost system, the radar input would be
preprocessed by the radar input line multiplexor (see App. G) and presented
in a blocked record format to the mainframe.


The interface between the mainframe in the baseline rehost configuration
and the display generators would be a display buffer (see App. G). The
refresh buffer would be capable of supporting the high data transfer rates
required by the display generator and allow periodic access by the mainframe
to update the display data.

4.3.6  Diagnose  and  Error  Analysis 

The diagnose instruction provides assistance in sorting out hardware
problems. The functional operation and data values returned for this
instruction differ for nearly every instruction-compatible computer and
even for the 9020A and 9020D computers. Diagnose is an important part of
the element error analysis and configuration and the I/O error analysis
subsystems. Hence, these monitor subsystems would require careful analysis
and revision not only to incorporate the rehost version of diagnose but to
accommodate the baseline rehost configuration which is significantly
different from the current 9020 configuration.

Another  aspect  of  the  error  analysis  is  the  requirement  that  the 
operation  of  the  system  be  resumed  as  soon  as  possible  after  a  failure  has 
been  detected  and  resolved.  An  essential  part  of  resumed  operations  is  to 
provide  a  valid  copy  of  critical  data  values  without  resorting  to  complete 
reconstruction  in  the  event  of  a  detected  compromise  to  the  active 
database.  In  the  present  system,  critical  data  values  are  written  to  disk
on  a  periodic  basis  (30  second  interval)  to  facilitate  database  restoration.

In the rehost system, the reconfiguration process in the event of a
failure would result in the transfer of active status to the "stand-by"
processor. This transfer of status would be completed within 2 to 5 seconds
since all of the software in the "stand-by" processor would always be
initialized awaiting access to the most recent set of critical data values.
Note that a complete initiation of the rehost system starting with the
initial program load for the virtual machine monitor would require at least
5 minutes.

In order to better support the reconfiguration process in the rehost
system, the critical data values should be saved more frequently, perhaps
with a 3 to 5 second interval. In addition, the range of critical data
values saved should be reviewed in order to identify additional data values
that would allow faster resumption of automated processes.
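
As a rough illustration of why the save interval matters, the sketch below
compares the worst-case age of the restored data under the present 30 second
interval with the proposed 3 to 5 second interval. It assumes, as a
simplification that this report does not itself state, that a failure can
occur just before the next save and that the switchover takes the full upper
limit of the 2 to 5 seconds quoted above. The function name is hypothetical.

```python
def worst_case_staleness(save_interval_s, switchover_s):
    """Upper bound, in seconds, on the age of critical data at resumption.

    The data can be at most one save interval old at the moment of
    failure, and it ages further while the stand-by processor takes over.
    """
    return save_interval_s + switchover_s


SWITCHOVER_MAX_S = 5  # upper end of the 2 to 5 second transfer time

for interval in (30, 5, 3):
    age = worst_case_staleness(interval, SWITCHOVER_MAX_S)
    print(f"save every {interval:2d} s -> data at most {age} s old")
```

Under these assumptions the save interval, not the switchover time, dominates
the age of the restored data when the present 30 second interval is kept.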

4.3.7  Support  Software 

The software support tools for the NAS system should be reusable in the
rehost environment. Only the performance monitoring tools would require
changes to reflect differences in timer support and device configurations.
In particular, the high resolution timer (HRT) tool would no longer require
a dedicated processor in order to generate high resolution timer intervals.

4.3.8  Summary

This section has considered a range of issues relating to the 9020
environment. The conclusion is that these issues can be resolved within the
context of the VM monitor and the interaction between the NAS monitor and
the VM monitor.

REFERENCES

[FAAT81] FAA Technical Center, Informal discussion with ACT-TOO staff about
their experiences in rehosting the Flight Data Processing component of the
NAS application software on an IBM 4341 computer system operated with the
VM/370 operating system. Atlantic City, NJ, May 1981.

[IBM73] IBM 9020D and 9020E System Principles of Operations, January 1973.

[IBM75] IBM SPAR 68, A302.2 RBB Instruction Sequence Scan Report, August
1975.

5.  COST


5.1  Introduction


The purpose of this chapter is to estimate the cost of rehosting the
NAS software on instruction-compatible machines. This cost covers the
development, acquisition, and operation of the new system. In estimating
this cost six principles are followed.

First, the goal is to estimate the cost of instruction-compatible
replacement relative to the cost incurred under the status quo. That is,
the baseline against which cost is measured is the hypothetical situation in
which the current system continued to operate through the rest of this
decade. In other words, what is estimated is the change in the cost of
providing en route air traffic control services that would result if
instruction-compatible replacement were adopted.

Second,  because  the  rehosting  problem  is  not  completely  understood,  the 
estimates  in  this  chapter  should  be  thought  of  as  first  approximations 
rather  than  as  definitive.  The  goal  is  to  give  plausible  estimates  of  what 
the  cost  of  rehosting  might  be,  but  further  study  would  be  needed  before  one 
could  have  a  high  level  of  confidence  in  the  cost  estimates.  The  FAA 
personnel  who  provided  the  basic  information  used  in  this  chapter  operated 
under  the  understanding  that  what  was  needed  was  a  reasonable  first
approximation  and  that  they  would  be  contacted  again  if  a  more  accurate 
approximation  was  needed. 

Third,  a  conservative  approach  is  used  in  estimating  the  costs  to  make
sure  that  the  cost  of  rehosting  is  not  underestimated;  whenever  there  is 
doubt  about  a  particular  cost,  a  higher  figure  is  chosen.  Therefore,  the 
cost  estimated  in  this  chapter  can  be  thought  of  as  an  upper  bound;  effort 
has  been  made  to  make  this  upper  bound  as  tight  as  possible. 

Fourth, the procedure followed in this section is to spell out the
basic data and the assumptions that are used to produce the cost estimates.
It is not claimed that the data and assumptions are precise and perfect; all
that is claimed is that the data and assumptions used reflect our
understanding of the problem at the time this report was written. Every
effort has been made to make clear what data and assumptions are used; the
reader who has better assumptions or data should have no trouble with
re-doing the calculation and producing his own estimates.

Fifth,  it  is  assumed  that  the  computers  are  replaced  at  all  twenty 
centers.  Appendix  F  discusses  the  case  where  replacement  only  occurs  at 
some  of  the  centers. 

Sixth,  all  cost  estimates  are  in  1981  dollars.  No  attempt  has  been
made  to  estimate  how  these  costs  will  change  over  time. 

If instruction-compatible replacement were undertaken, the change in
the cost of providing en route air traffic control services would fall into
five broad categories:

• software: this includes the development and testing of new
software and its integration with the old software and the new
hardware;

• hardware: this includes the development, testing, and acquisition
of the new hardware;

• maintenance cost: this includes the expenditure on personnel and
parts made in order to maintain and support the system once it is
in operation;

• transition cost: this includes the cost of remodeling needed to
prepare the site for installation, of special hardware needed only
for the transition period, and of training and other personnel
costs;

• program management and support cost: this includes the cost
incurred by the FAA in administering the procurement.

Each of the costs will now be discussed.

5.2  Software  Cost 


The main advantage of replacing the 9020's with instruction-compatible
machines is that the current NAS software would then be used on the new
machines, and a wholesale rewriting of the software could be avoided.
Nevertheless, some changes in the current software would be needed for the
reasons discussed in Chapter 4, including the need to analyze and
reconfigure new hardware when there is a failure, to handle new peripherals,
and to deal with instruction set differences. An estimate of the money and
time needed to carry out these changes in the NAS software will now be given.

Table 5-1 shows the NAS software subsystems and the size (in words) of
each. HH Aerospace, after studying the rehosting problem, has estimated
both the percentage of the words of code in each module that would be
affected by rehosting and also the difficulty involved in dealing with this
code; these estimates are shown in Table 5-1. It should be stressed that
while the application code will be affected, it is not expected that it will
be changed; the problems described in Ch. 4 will be taken care of by some
method other than changing the application code. While this code will not
be changed, it will have to go through testing and integration. Not shown
in Table 5-1 are the changes that must be made to the virtual machine
monitor to adapt it to the baseline rehost configuration and to modify it to
support the handling of the non-standard instructions. It is estimated that
25,000 of the 500,000 words in the virtual machine monitor (5 percent) would
need to be redesigned and recoded.

Software development and testing cost has been estimated with the PRICE
S software cost estimation model. The estimation has been carried out by
The Analytic Sciences Corporation (TASC) and is documented in a forthcoming
report [TASC81]; the reader is referred to that report for details. Table
5-2 shows some of the assumptions used and Table 5-3 shows how the code is
classified.

TABLE 5-1: ESTIMATED PERCENTAGE OF THE NAS SOFTWARE AFFECTED BY REHOSTING

Subsystem                                Size  Affected  Difficulty
----------------------------------  ---------  --------  --------------
A. CCC Application Code
Preliminary processing                 15,692       20%  average
Flight data processing                 53,212       10%  average
Route conversion                       36,462       10%  average
Disk storage applications              22,732       20%  difficult
Posting determination                  33,260       10%  average
Flight status alerts                   46,872       10%  average
Inquiry processing                     62,296       10%  average
Supervisory and interfacility          36,648       10%  difficult
Hardware error processing               6,934       20%  difficult
Track data processing                  41,259       10%  average
Display channel outputs                48,652       20%  difficult
Real-time quality control               6,032       10%  average
Radar processing and tracking          31,400       50%  difficult
Flight plan analysis                    2,562       10%  average

B. Monitor Code
Startup/startover management            5,260       50%  difficult
Element error analysis and config.     10,980      100%  very difficult
Input/output management                 4,046       20%  difficult
Input/output error analysis             2,566      100%  difficult
Program element control                 1,256       20%  average
Program element synchronization         1,266       30%  difficult
Storage and communication mgmt.         2,432       20%  difficult
Man-machine communication              17,096       20%  difficult
On-line data recording services         6,900       50%  difficult
On-line test tools                     18,688       50%  difficult
Contents supervisor                    10,145       20%  difficult
Input/output device dependent code     11,542      100%  very difficult

C. Miscellaneous Code
DCC                                    78,000       50%  difficult
CDC                                    48,700        0%
DARC                                   53,000        0%
NDM                                 2,000,000        0%
NOSS                                  323,000        5%  average
Op support                          1,000,000        0%

Source: Size - [PDSSOf Sec. 5.1.1]
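
To give a feel for the scale implied by Table 5-1, the following sketch
multiplies each subsystem size by its affected percentage and totals the
result for each of the three groups. The report does not tabulate these
products itself, so the totals are a derived illustration rather than figures
from the source; the rows are transcribed from Table 5-1 and the function
name affected_words is hypothetical.

```python
# (subsystem, size in words, percent affected), transcribed from Table 5-1
APPLICATION = [
    ("Preliminary processing", 15_692, 20),
    ("Flight data processing", 53_212, 10),
    ("Route conversion", 36_462, 10),
    ("Disk storage applications", 22_732, 20),
    ("Posting determination", 33_260, 10),
    ("Flight status alerts", 46_872, 10),
    ("Inquiry processing", 62_296, 10),
    ("Supervisory and interfacility", 36_648, 10),
    ("Hardware error processing", 6_934, 20),
    ("Track data processing", 41_259, 10),
    ("Display channel outputs", 48_652, 20),
    ("Real-time quality control", 6_032, 10),
    ("Radar processing and tracking", 31_400, 50),
    ("Flight plan analysis", 2_562, 10),
]
MONITOR = [
    ("Startup/startover management", 5_260, 50),
    ("Element error analysis and config.", 10_980, 100),
    ("Input/output management", 4_046, 20),
    ("Input/output error analysis", 2_566, 100),
    ("Program element control", 1_256, 20),
    ("Program element synchronization", 1_266, 30),
    ("Storage and communication mgmt.", 2_432, 20),
    ("Man-machine communication", 17_096, 20),
    ("On-line data recording services", 6_900, 50),
    ("On-line test tools", 18_688, 50),
    ("Contents supervisor", 10_145, 20),
    ("Input/output device dependent code", 11_542, 100),
]
MISCELLANEOUS = [
    ("DCC", 78_000, 50),
    ("CDC", 48_700, 0),
    ("DARC", 53_000, 0),
    ("NDM", 2_000_000, 0),
    ("NOSS", 323_000, 5),
    ("Op support", 1_000_000, 0),
]


def affected_words(rows):
    """Sum of size times affected percentage over the rows, in words."""
    return sum(size * pct for _, size, pct in rows) / 100


for label, rows in (("application", APPLICATION),
                    ("monitor", MONITOR),
                    ("miscellaneous", MISCELLANEOUS)):
    total = sum(size for _, size, _ in rows)
    print(f"{label:13s} {total:>9,d} words, "
          f"{affected_words(rows):>9,.0f} affected")
```

Note that "affected" here follows the usage in the text: application code
counted this way is expected to need testing and integration, not rewriting.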


TABLE 5-2: SOFTWARE DEVELOPMENT COST ASSUMPTIONS

CHARACTERISTIC                ASSUMPTION(S)
----------------------------  --------------------------------------------
Type of System                MIL-SPEC ground-based aircraft control
                              system.

Hardware Effects on Software  Capacity problems not anticipated.
                              Response time problems not anticipated.
                              Software can support all hardware
                              interfaces in system.

Labor Costs                   Per man-month labor costs are taken to be:
                                $6968 (design)
                                $5796 (implementation)
                                $5829 (test & integration)

Types of Software             The software to be developed consists of:
                                application software (average difficulty)
                                application software (difficult to develop)
                                monitor software (average difficulty)
                                monitor software (difficult to develop)
                                monitor software (very difficult to develop)
                                NOSS software (average difficulty)

Secondary Costs               15% of labor costs

Escalation                    All costs in constant 1981 dollars

Integration of Software       Typical level of integration effort
Into System Level             anticipated.
Configuration

Source: [TASC81]


TABLE 5-3: CLASSIFICATION OF THE CODE TO BE MODIFIED

NAS            LEVEL            PRICE S
SOFTWARE TYPE  OF DIFFICULTY    APPLICATION CLASS
-------------  ---------------  ----------------------------------
Application    Average          Real-Time Command & Control
Application    Difficult        Interactive Operations
Monitor        Average          Operating Systems
Monitor        Difficult        Operating Systems
Monitor        Very Difficult   Special class with an application
                                class value 10% greater than the
                                Operating Systems value.
NOSS           Average          Data Storage and Retrieval

Source: [TASC81]


TABLE 5-4: ESTIMATES OF THE SOFTWARE DEVELOPMENT AND TESTING COST

                         Estimated Cost (millions)
Category of Software       Low       Best      High
--------------------    ------     ------    ------
On-line                 $2.520     $3.330    $3.650
VM modification          1.540      2.070     2.250
Support                  0.330      0.420     0.450
Total                   $4.390     $5.820    $6.350

Source: [TASC81]

Table 5-4 shows that the estimate of the software development and
testing costs ranges from $4.390 million to $6.350 million, depending on the
assumptions made. The best estimate is $5.820 million. This table breaks
the cost down into the cost of changing the on-line software, the support
software, and the virtual machine monitor. This cost estimate covers the
development, testing, and integration of the new software; once this process
is completed, the system is ready to be installed and tested at the FAA
Technical Center.
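
The totals in Table 5-4 can be checked by adding the three categories in each
column. The short sketch below does so with exact decimal arithmetic; the
figures are transcribed from the table, while the names COSTS, REPORTED_TOTAL
and column_totals are introduced here only for illustration.

```python
from decimal import Decimal

# Estimated cost in millions of 1981 dollars as (low, best, high), Table 5-4
COSTS = {
    "On-line":         ("2.520", "3.330", "3.650"),
    "VM modification": ("1.540", "2.070", "2.250"),
    "Support":         ("0.330", "0.420", "0.450"),
}
REPORTED_TOTAL = ("4.390", "5.820", "6.350")


def column_totals(costs):
    """Add the low, best and high columns exactly, one total per column."""
    columns = zip(*costs.values())
    return tuple(sum(map(Decimal, column)) for column in columns)


totals = column_totals(COSTS)
print("computed:", ", ".join(str(t) for t in totals))
print("matches table:", totals == tuple(map(Decimal, REPORTED_TOTAL)))

# Share of the best estimate taken by each category.
best_total = totals[1]
for name, (_, best, _) in COSTS.items():
    print(f"{name:16s} {Decimal(best) / best_total:.1%}")
```

Decimal is used instead of floating point so that the comparison with the
printed totals is exact.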

5.3  Hardware  Cost 

The hardware cost that would be incurred under rehosting falls into the
categories of mainframe cost, peripherals cost, special hardware cost, and
system testing cost. Each of these will now be discussed. These costs are
drawn mainly from a forthcoming TASC report [TASC81].

Mainframe cost. The two leading mainframes that are candidates for
rehosting are the Amdahl 470/V7 and the IBM 3033U. Table 5-5 shows the cost
of the mainframe and associated hardware that would be borne if there were
rehosting. This table and the next are based on list prices. Each
mainframe is assumed to have 8 megabytes of memory and 12 channels. The
cost for each center is estimated to be about $4.3 million if the 470/V7 is
selected and about $6.6 million if the 3033U is selected. (If the 470/V8
rather than the 470/V7 were selected, the added cost at each site would be
about $300,000.)

One advantageous aspect of rehosting should be pointed out. Since
off-the-shelf mainframes are used, the processor can be upgraded if this
proves desirable. For example, a V7 can be field-upgraded to a V8 at a cost
of $250,000 [AMDA81]; this yields an increase in processor capacity which is
estimated by one source to be 7 percent [LIAS80, p. 104] and by another to
be 23 percent [HEMK81, p. 14]. The ease of upgrading means that the FAA can
avoid being pushed into overbuying by the uncertainty over how much
processor capacity is needed.


TABLE 5-5: MAINFRAME ACQUISITION COST AT EACH CENTER

                                 Unit Price   Number        Total

Amdahl
  470/V7*                        $2,125,000      2     $4,250,000
  Channel to Channel Adapter         32,500      2         65,000
  Two Byte Interface                  1,400      2          2,800
  Total                                                $4,317,800

IBM
  3033U AOS                      $2,376,000      2     $4,752,000
  Extended Addressing                93,900      2        187,800
  Extended Control Store             24,800      2         49,600
  Data Streaming                     40,000      2         80,000
  3033 Extension                     35,000      2         70,000
  MPD                               287,000      2        574,000
  Power/Coolant Unit                228,000      2        456,000
  Console                           192,000      2        384,000
  Total                                                $6,553,400

*Price includes power, cabinet, and console.
N.B. Each mainframe is assumed to have 8 megabytes of main memory and 12
channels.
Source: [TASC81]
TABLE 5-6: PERIPHERALS ACQUISITION COST AT EACH CENTER

                                 Unit Price ($)   Number      Total

Mag. Tape 3240                       24,190          4      $96,760
Mag. Tape 3803 Controller            36,815          2       73,630
Disk 3350                            40,000          4      160,000
Disk 3380 Controller                 97,650          2      195,300
Line Printer                         51,130          2      102,260
L.P. Controller                      17,685          2       35,370
I/O Switch                           79,620          2      159,240
Total                                                      $822,560

Source: [TASC81]
RIM Line Multiplexor                $150,000
Refresh Buffer                       150,000
Total                               $300,000

Source: [TASC81]


TABLE 5-8: SPECIAL HARDWARE ACQUISITION COST PER CENTER

Special                          Unit      Units per    Total per
Hardware                         Cost       Center       Center

RIM Line Multiplexor           $ 3,500        25        $ 87,500
Refresh Buffer                  10,000        15         150,000
Cabinet, Power and Connectors    1,000         2           2,000
Total                                                   $239,500

Source: [TASC81]


TABLE 5-9: SUMMARY OF THE HARDWARE COST

Engineering cost                         $  300,000
Acquisition cost per center*
  Amdahl 470/V7                           5,379,860
  or IBM 3033U                            7,615,460
System testing                            6,300,000

* includes acquisition of mainframe, peripherals, and special hardware
Source: Tables 5-5 through 5-8
Peripherals cost. Table 5-6 shows that $0.8 million is the cost at
each center of replacing the magnetic tape units, the disk units, the line
printer, and the control units.

Special hardware cost. The special hardware required by rehosting is
the RIM line multiplexor and the refresh buffer as described in App. G. For
each piece of special hardware there is a one-time engineering cost for the
system that covers design, development, testing, and software. This
one-time engineering cost is estimated to be $300,000, as Table 5-7 shows.
The special hardware acquisition cost for each center is estimated to be
$239,500. Table 5-8 shows the number of units needed at each center and the
unit cost.

System testing cost. After the hardware and software are developed and
tested by the contractor, the FAA will test the complete system at the FAA
Technical Center and then at an en route center. For convenience, this cost
is included here. The testing process is expected to take 15 months (see
Ch. 7) and to cost $5.0 million per year, a figure provided by the FAA.
Therefore, the total system testing cost is estimated to be $6.3 million.

Summary. The hardware cost is summarized in Table 5-9. The
engineering cost for the special hardware, which is incurred only once for
the system as a whole, is estimated to be $300,000. The cost of acquiring
the mainframe, peripherals, and special hardware for each center is
estimated to be either $5.380 million (if 470/V7's are procured) or $7.615
million (if 3033U's are procured). The system testing cost, which is
incurred once, is estimated to be $6.300 million.
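
The per-center acquisition figures are sums of the table totals, and the
testing figure is an annual rate prorated over 15 months. As a cross-check,
the arithmetic can be sketched in Python. The function and variable names
below are illustrative and are not part of the report; the dollar amounts
are copied from Tables 5-5, 5-6, and 5-8.

```python
# Per-center totals in dollars, copied from Tables 5-5, 5-6, and 5-8.
MAINFRAME_TOTAL = {
    "Amdahl 470/V7": 4_317_800,
    "IBM 3033": 6_553_400,
}
PERIPHERALS_TOTAL = 822_560
SPECIAL_HARDWARE_TOTAL = 239_500


def acquisition_cost_per_center(mainframe: str) -> int:
    """Mainframe plus peripherals plus special hardware for one center."""
    return MAINFRAME_TOTAL[mainframe] + PERIPHERALS_TOTAL + SPECIAL_HARDWARE_TOTAL


def system_testing_cost(months: int = 15, dollars_per_year: float = 5.0e6) -> float:
    """Testing cost as the annual rate prorated over the testing period."""
    return dollars_per_year * months / 12


if __name__ == "__main__":
    for name in MAINFRAME_TOTAL:
        print(name, acquisition_cost_per_center(name))
    print("testing", system_testing_cost())
```

With these inputs the testing cost comes to $6.25 million, which the text
rounds to $6.3 million.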

5.4  Maintenance  Cost 


5.4.1  Introduction 

Maintenance and support of the 9020's is carried out by personnel at
the ARTCC's, the FAA Technical Center, and the FAA Depot. If there is a
failure, the problem is diagnosed by personnel at the ARTCC, perhaps
assisted by Technical Center personnel. If the failure is a hardware
failure, the faulty part, once located, is replaced. It is repaired at the
center or, if the repair is complex and the part costs more than $300, it is
sent to the Depot under the exchange and repair program. The Depot then
sends a good part from its stock to the ARTCC and repairs the faulty part if
possible and adds it to its stock. The Depot is responsible for providing
virtually all spare parts to the centers.

This maintenance strategy might be changed in two ways under
rehosting. First, the FAA might find it advantageous to diagnose problems
by using a telephone link with a remote diagnostic center. Second, because
of the use of large scale integration, it is not really feasible for the FAA
to repair failed cards; repair of these cards would require very elaborate
facilities and would probably be done by the manufacturer. Nevertheless,
for purposes of estimation, it is assumed that the rehost system is
maintained in the same way as the current system.

The amount by which the maintenance and support cost would change under
rehosting will now be estimated. This cost is divided into the two
categories of personnel and parts.

5.4.2  Personnel 

The expected change in the wages paid to maintenance and support
personnel will now be estimated; costs of training these personnel are
considered under transition cost in Sec. 5.5. Since the changes made to the
software would not greatly alter its size or structure, it is assumed that
there would be no change in the software maintenance cost.

It is expected that there will be a decrease in the hardware
maintenance cost because of the technological progress that has occurred
since the 9020's were purchased. Not only is modern hardware more reliable,
as Ch. 2 points out, but when there is a failure it is easier to diagnose
and repair. Also, since the replacement system will have fewer components
than the old, there will be a lesser need for specialists.
TABLE 5-10: ANNUAL REDUCTION IN THE COST OF HARDWARE MAINTENANCE
PERSONNEL AT A TYPICAL ARTCC

Year                    Reduction

First                   $      0
Second                   137,431
Third                    274,861
Fourth and later         412,292
The magnitude of the reduction in the personnel cost of hardware
maintenance is estimated in the following way. A recent report commissioned
by the FAA estimates that at a typical ARTCC there are the equivalent of
33.9 full-time Airway Facilities Service personnel working on automation
[ASI80, p. 4-4]. Assume that the average grade is a GS-13, step 4, with an
annual salary of $35,252; increase this by 15 percent to $40,540 to cover
benefits and overtime. This gives a government outlay of $1,374,306 at a
typical ARTCC.

It is assumed that during the first year that an ARTCC has the new
system, there will be no reduction in the cost of hardware maintenance
personnel because of the frictions of transition. It is assumed that there
is a 10 percent reduction in each of the second, third, and fourth years;
therefore, the long-term reduction is 30 percent. (The Airway Facilities
Service has stated that 30 percent is a reasonable figure.) The dollar
amounts that would be saved per center each year are shown in Table 5-10.
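
The entries of Table 5-10 follow mechanically from the staffing figure, the
loaded salary, and the 10-percent-per-year schedule. The sketch below, in
Python, reproduces them; the names are illustrative and not from the
report, and rounding to whole dollars is an assumption about how the
published figures were obtained.

```python
FULL_TIME_EQUIVALENTS = 33.9   # AF personnel on automation at a typical ARTCC
BASE_SALARY = 35_252           # GS-13, step 4, dollars per year
LOADING = 0.15                 # allowance for benefits and overtime


def loaded_salary() -> int:
    """Salary after the 15 percent loading, rounded to whole dollars."""
    return round(BASE_SALARY * (1 + LOADING))


def annual_outlay() -> float:
    """Government outlay for hardware maintenance personnel at one center."""
    return FULL_TIME_EQUIVALENTS * loaded_salary()


def reduction_in_year(year: int) -> int:
    """Saving at one center in a given year after installation (1-based).

    No saving in year 1, then 10 percentage points more each year,
    capped at 30 percent from the fourth year onward.
    """
    percent = 10 * min(max(year - 1, 0), 3)
    return round(annual_outlay() * percent / 100)


if __name__ == "__main__":
    for year in range(1, 6):
        print(year, reduction_in_year(year))
```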

The long-term reduction of 30 percent is chosen since it is a
conservative  figure  that  appears  not  to  overestimate  the  savings  that 
rehosting  would  provide.  Though  the  number  of  hardware  failures  is  expected 
to  fall  by  40  to  90  percent,  this  lower  figure  of  30  percent  is  chosen  for 
two  reasons.  First,  the  reduction  in  personnel  is  less  than  the  reduction 
in  failures  because  of  the  need  for  specialists.  For  example,  even  if  there 
is  a  greatly  reduced  number  of  memory  failures,  it  is  still  necessary  to 
have  a  specialist  who  can  deal  with  memory.  Second,  the  figure  of  33.9 
full-time  personnel  is  slightly  too  large  since  it  includes  AF  personnel  who 
maintain  the  software  in  the  display  channel,  which  is  not  relevant. 

Since  the  number  of  relevant  hardware  maintenance  personnel  at  the 
Technical  Center  and  the  Depot  is  small,  no  reduction  in  cost  at  these 
organizations is estimated. There are about 25 AF personnel at the
Technical  Center.  The  chief  of  the  Engineering  and  Production  Branch  at  the 
Depot  estimates  that  the  equivalent  of  between  2  and  3  full-time  technicians 
work  on  9020  parts  in  the  exchange  and  repair  program. 
5.4.3 Parts


The change in the cost of replacement parts that would result from
rehosting can be divided into the start-up costs and the annual cost.

Start-up cost. Information on the start-up costs that would be incurred
in laying in an initial inventory of replacement parts and in meeting other
front-end requirements was provided by the chief of the NAS Project and
Provisioning Section of the Depot. These costs are expressed as a
percentage of the hardware acquisition cost for a single center. These
percentages are based on rules of thumb and on experience with other
systems, not on a study of the rehosting problem. Therefore, these
percentages should be thought of only as first approximations. There are
start-up costs both for the Depot and for each center.

For the Depot there are two start-up costs. First, 6 percent of the
hardware cost at one site is assumed to cover (a) documentation on
engineering and provisioning, including engineering drawings and all other
engineering specifications (as required by FAA-G-1210d [FAA78]); (b)
training on how to use the testbed; (c) development of equipment to
troubleshoot the system. Second, 10 percent is needed to purchase a
testbed, which is the hardware needed to test parts that have been repaired
to ensure that the repairs have been made properly. The start-up cost at the
Depot, then, is

     $5.380 million x 0.16 = $0.861 million

if V7's are procured or

     $7.615 million x 0.16 = $1.218 million

if 3033U's are procured.

For each center there is a start-up cost of 24 percent. This goes for
spare parts, some of which are stocked at the center and some at the Depot.
This figure breaks down into 4 percent for parts common, i.e., parts that
can be ordered from a vendor's catalog, and 20 percent for parts peculiar,
i.e., parts that are not parts common. Therefore, the start-up cost at each
center is

     $5.380 million x 0.24 = $1.291 million

if V7's are procured or

     $7.615 million x 0.24 = $1.828 million

if 3033U's are procured. Therefore, the initial stock of spare parts for
all 20 centers is $25.820 million if V7's are procured and $36.560 million
if 3033U's are procured. It should be stressed that this figure of 24
percent is very conservative. Moreover, it does not take into account the
dramatically improved reliability of the new system (as discussed in
Ch. 2). Therefore, it is probable that the estimate of the initial spare
parts cost is much too high.
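
The start-up figures for the Depot and the centers are fixed percentages of
the per-center hardware acquisition cost. A minimal Python sketch of that
calculation follows. The names are illustrative, and the choice to round
the per-center figure to three decimals before multiplying by 20 is an
assumption made to match the published totals.

```python
# Per-center hardware acquisition cost in millions of dollars (Table 5-9).
ACQUISITION_COST = {"V7": 5.380, "3033": 7.615}

DEPOT_RATE = 0.06 + 0.10    # documentation, training, test equipment + testbed
CENTER_RATE = 0.04 + 0.20   # parts common + parts peculiar
NUMBER_OF_CENTERS = 20


def startup_costs(machine: str) -> dict:
    """One-time parts and provisioning costs, in millions of dollars."""
    base = ACQUISITION_COST[machine]
    depot = round(base * DEPOT_RATE, 3)
    per_center = round(base * CENTER_RATE, 3)
    all_centers = round(per_center * NUMBER_OF_CENTERS, 3)
    return {
        "depot": depot,
        "per_center": per_center,
        "all_centers": all_centers,
        "total": round(depot + all_centers, 3),
    }


if __name__ == "__main__":
    for machine in ACQUISITION_COST:
        print(machine, startup_costs(machine))
```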

Annual cost. According to the chief of the General Materiel Section of
the Depot, the FAA has a contract with IBM under which the FAA buys the
replacement parts needed for the 9020A, D, and E systems. For the last few
years the annual cost of the spare parts for these systems has hovered
around $950,000. Similar information on the cost of replacement parts for
the Raytheon 730 is not available, so the following rough approximation will
be used. There are 25 9020A, D, and E systems and 15 Raytheon 730 systems.
Therefore, assume that the cost of parts for the 730's is 15/25 of the
cost of parts for the 9020's, i.e., $570,000. This means that the total
annual cost of replacement parts for the CCC's and display channels is
$1,520,000. (This cost is estimated to be $2.3 million in a report on
maintenance cost prepared for the FAA [ASI80, p. 6-6]. This estimate,
however, seems to be based on less reliable information, and it is not used
here.)

It is assumed that with a new system the annual expenditure on parts
will fall by two-thirds, i.e., by $1,013,333. Even though the actual parts
usage will, it is thought, fall by somewhat more than two-thirds, this
lower figure is used to make sure that the saving is not overstated and to
allow for the possibility that the current parts usage of the Raytheon 730
has been overstated.

Table 5-11 summarizes the effect that rehosting would have on the
expenditure on replacement parts. Initially there would be a one-time cost
of either $26.681 million if V7's are procured or $37.778 million if
3033U's are procured; this would lay in a stock of replacement parts and
set the Depot up so it could deal with the new system. There would,
however, be a saving of $1.013 million each year for the system because the
new system would require fewer replacement parts.
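
The annual parts saving rests on two simple steps: scaling the known IBM
parts cost to the Raytheon 730 systems by system count, then taking
two-thirds of the combined figure. The Python sketch below restates that
chain. The names are illustrative and not from the report.

```python
IBM_9020_PARTS_COST = 950_000   # dollars per year, per the Depot
NUMBER_OF_9020_SYSTEMS = 25
NUMBER_OF_730_SYSTEMS = 15


def raytheon_730_parts_cost() -> int:
    """Approximate 730 parts cost by scaling the 9020 cost by system count."""
    return IBM_9020_PARTS_COST * NUMBER_OF_730_SYSTEMS // NUMBER_OF_9020_SYSTEMS


def current_parts_cost() -> int:
    """Total annual parts cost for the CCC's and display channels."""
    return IBM_9020_PARTS_COST + raytheon_730_parts_cost()


def annual_parts_saving() -> int:
    """Assumed two-thirds reduction in annual parts expenditure."""
    return round(current_parts_cost() * 2 / 3)


if __name__ == "__main__":
    print(raytheon_730_parts_cost(), current_parts_cost(), annual_parts_saving())
```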

5.4.4 Summary of the Estimated Annual Savings in Maintenance Cost

The time profile of the savings on personnel and parts will now be
considered. It is assumed that the new system will go into operation at the
20 ARTCC's over a period of 2 years (see the procurement schedule in
Ch. 7). Therefore, it is here assumed that 10 systems go into operation in
the first year and 10 in the second. The yearly savings are shown in
Table 5-12 and will now be explained.

Personnel cost. For any one center it has been assumed that the cost of
hardware maintenance personnel will not change in the first year of the new
system but will then decline by 10 percent in each of the next three years
for an eventual annual saving of 30 percent of the estimated current figure
of $1.374 million per year. Consider the 10 systems installed in the first
year. There is no saving in the first year. The saving in the second year
is $1.374 million, i.e., $1.374 million x 10% x 10 centers. The savings in
the third and fourth years are then $2.748 million and $4.122 million. For
the 10 systems installed in the second year, the savings are also $1.374,
2.748, and 4.122 million, realized in the third, fourth, and fifth years
respectively. These two streams are then added together to obtain the total
personnel saving per year, which is shown in Table 5-12.
Parts. Since the annual saving in parts with the new system is
estimated to be about $1.013 million per year, half this amount is saved the
first year when half the systems are in operation, and the full $1.013
million is saved in subsequent years. These figures are shown in Table 5-12.
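
The yearly totals of Table 5-12 come from overlaying two installation
cohorts of 10 centers each. A small Python sketch of that overlay is given
below. It works in thousands of dollars with integer arithmetic; the names
are illustrative, and the integer division used for the first-year parts
figure is an assumption that happens to reproduce the 0.506 in the table.

```python
OUTLAY_PER_CENTER = 1374     # thousands of dollars per year, personnel
PARTS_SAVING = 1013          # thousands of dollars per year, whole system
INSTALLED_IN_YEAR = {1: 10, 2: 10}   # installation year -> number of centers
TOTAL_CENTERS = 20


def personnel_saving(year: int) -> int:
    """Sum the 0/10/20/30 percent schedule over both installation cohorts."""
    total = 0
    for install_year, centers in INSTALLED_IN_YEAR.items():
        years_in_service = year - install_year + 1
        steps = min(max(years_in_service - 1, 0), 3)   # 10 percent per step
        total += OUTLAY_PER_CENTER * steps * centers // 10
    return total


def parts_saving(year: int) -> int:
    """Parts saving in proportion to the share of centers already installed."""
    installed = sum(n for y, n in INSTALLED_IN_YEAR.items() if y <= year)
    return PARTS_SAVING * installed // TOTAL_CENTERS


def total_saving(year: int) -> int:
    return personnel_saving(year) + parts_saving(year)


if __name__ == "__main__":
    for year in range(1, 6):
        print(year, personnel_saving(year), parts_saving(year), total_saving(year))
```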

5.5  Transition  Cost 


Transition  cost  covers  all  the  costs  that  are  incurred  because  of  the 
switch-over  to  a  new  system  and  can  be  divided  into  four  categories: 
remodeling  cost,  special  hardware  cost,  extra  personnel  cost,  and  training 
cost.  Each  of  these  categories  will  now  be  discussed. 

Remodeling cost. If the broadband is removed from the centers in 1984
as planned, then there should be sufficient floorspace to comfortably house
the old and new systems simultaneously; this means that no major
construction would be needed [MOLL81, pp. 39-40]. The cost of remodeling is
estimated by the FAA to be one million dollars per center.

Special hardware cost. There will be a need for hardware that will be
thrown away after the transition period, e.g., extra cables and switches.
It is expected that this would be minor, so no cost is assigned.

Extra personnel cost. This cost refers to the extra personnel that
might be needed to help make the transition to the new system. These extra
personnel might be needed just before replacement when the heavy training
schedule has temporarily depopulated a center. The extra personnel might
also be needed during the period of parallel operation; with two different
systems operating, the center's normal staff might be overtaxed. (This
topic is further discussed in Chapter 6.) It would be premature to specify
how the transition would be made and how many extra personnel would be
needed. To serve as a round figure representing the cost of extra
personnel, $200,000 per center is chosen.
TABLE 5-11: CHANGE IN THE EXPENDITURE ON REPLACEMENT PARTS
DUE TO REHOSTING

                                   Cost (millions)

One-Time Costs
  V7:
    At the Depot                      $ 0.861
    At 20 Centers                      25.820
    Total                             $26.681
  3033U:
    At the Depot                      $ 1.218
    At 20 Centers                      36.560
    Total                             $37.778

Annual Cost for the System           ($1.013)

N.B. A figure in parentheses is a reduction in expenditure.
TABLE 5-12: ANNUAL MAINTENANCE COST SAVING PROVIDED BY REHOSTING (millions)

Year                Personnel      Parts       Total

First                 $0.0         0.506      $0.506
Second                $1.374       1.013      $2.387
Third                 $4.122       1.013      $5.135
Fourth                $6.870       1.013      $7.883
Fifth and after       $8.244       1.013      $9.257
Training cost. If confronted by a new system, those who operate and
maintain the system would require training. The cost of this extra training
that would be necessitated by rehosting will now be estimated. This
training cost can be divided into the cost of developing the new courses and
the cost of teaching them.

The cost of developing the new courses is estimated in the following
way. Table 5-13 shows the relevant AF courses currently given at the
Academy. These are the courses that would have to be replaced if there were
rehosting. Omitted from this list are courses that would still be given
under rehosting without significant change (e.g., courses on Jovial
programming, on the applications programs, and on hardware that would be
retained), courses that would not be replaced since the subject matter would
not be relevant under rehosting (e.g., courses on the display channel
hardware), and courses not needed because there will be only one system for
the CCC and display channel rather than three as at present.

The courses that would be given on the new system would probably be
structured differently from the current courses; nevertheless, the courses
listed in Table 5-13 will be used as a rough guide to what the new courses
might look like. (AT courses are not considered since there are only a few
of them and since they would be relatively untouched by rehosting.) These
courses together last a total of 81.8 weeks, or 3,272 hours. This figure
will be increased to 3,500 hours to allow for any new courses not captured
in Table 5-13.

The cost of developing these new courses can now be estimated. The
chief of the Automation Section of the Airway Facilities Branch of the
Academy has provided the rule of thumb that the ratio of development hours
to course hours is 30 to 1. Therefore, the number of hours needed to
develop the new courses is

     3,500 x 30 = 105,000.

The Budget Division of the Aeronautical Center puts a cost of $16.94 on each
hour of productive time spent by AF instructors. (This is calculated by
TABLE 5-13: RELEVANT AIRWAY FACILITIES COURSES OFFERED AT THE ACADEMY

                     Cost per
Course     Length    student
Number     (weeks)   per week   Title

43458         8        $119     IBM 9020 System Familiarization and BAL
                                Programming
43459         8         101     IBM 9020 Input-Output Equipment
43460         6         125     IBM 9020 A/D PAM and System Control
43462        20         143     IBM 9020D/E Processing
43437         4         123     IBM 2314-A1 Direct Access Storage Facility
43432         3         136     System Maintenance Monitor Console for
                                Technicians
43468         5         112     OS-360 and DASF Programming Techniques
43469        10         165     FOP and Monitor for Systems Performance
                                Specialists
43470         6         126     NAS Operational Program for Engineers
43471         5         108     NAS System Interface
43489/
90/91         6.8        NA     CCC for Engineers

Total
Weeks        81.8
taking the average salary of AF instructors of $27,748, increasing it by 27
percent to $35,240 to allow for benefits, leave, and training, and dividing
by 2080.) Therefore, the estimated cost of developing the new courses is

     105,000 x $16.94 = $1,778,700.

It would probably be the case that these courses would be developed largely
by a contractor rather than the FAA.
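
The course development estimate chains together a loaded hourly rate, a
planned number of course hours, and the 30-to-1 rule of thumb. It can be
restated as a short Python sketch. The names are illustrative, and the
40-hour week is an assumption inferred from 81.8 weeks being equated with
3,272 hours.

```python
AVERAGE_INSTRUCTOR_SALARY = 27_748   # dollars per year
LOADING = 0.27                       # benefits, leave, and training
HOURS_PER_YEAR = 2080

COURSE_WEEKS = 81.8                  # total length of courses in Table 5-13
HOURS_PER_WEEK = 40                  # assumption: 81.8 weeks = 3,272 hours
PLANNED_COURSE_HOURS = 3_500         # rounded up to allow for new courses
DEVELOPMENT_RATIO = 30               # development hours per course hour


def hourly_rate() -> float:
    """Cost of one productive instructor hour, rounded to cents."""
    return round(AVERAGE_INSTRUCTOR_SALARY * (1 + LOADING) / HOURS_PER_YEAR, 2)


def current_course_hours() -> int:
    return round(COURSE_WEEKS * HOURS_PER_WEEK)


def development_hours() -> int:
    return PLANNED_COURSE_HOURS * DEVELOPMENT_RATIO


def development_cost() -> int:
    """Estimated cost in dollars of developing the new courses."""
    return round(development_hours() * hourly_rate())


if __name__ == "__main__":
    print(hourly_rate(), current_course_hours(), development_hours(), development_cost())
```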

The cost of teaching the new courses will be estimated by developing an
equation that expresses the cost of training a student as a function of the
length of his training and then by combining this with information on the
amount of training that would be required.

The main source used to derive the cost of training a student is a
document compiled by the Budget Division of the Aeronautical Center
[AERO81]. This document estimates the cost of teaching each course per
student. The cost is divided into three components. The first cost is
personnel compensation and benefits, i.e., the instructor's time; this
includes both the time spent in the classroom and the time spent in
preparation. This cost is based on the actual hours reported by the
instructors. The second cost is supplies and course material such as
manuals. The third cost is overhead to cover administration, buildings, and
so forth; this cost would not be affected by rehosting and is, therefore,
not included in the estimates of training cost. For example, for the 9020
D/E processing course, number 43462, the per student cost for personnel
compensation and benefits is $2,859 and for supplies is $5, for a total of
$2,864. Since this course is 20 weeks long, the cost per student per week
is $143. The cost of other courses per student per week is similarly
calculated and shown in Table 5-13. A weighted average of these costs is
taken, where the weights are proportional to the length of the course, and
the resulting average cost per student per week is $130.

The per diem rate for a stay of two weeks or longer is $31, so the per
diem cost is $217 per week. This means that the average cost per week for
the course and the per diem for a student is $130 + $217 = $347, which is
rounded to $350. The average round trip travel cost to Oklahoma City is
assumed to be $400. Therefore, we have the training cost equation

     TC(w) = $400 + $350w,

which expresses the training cost for a single student as a function of the
number of weeks w he is at school. This equation does not include the
salary of the student since this cost would be incurred even if he were not
attending the course.
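
The weekly course cost and the training cost equation can be checked with a
few lines of Python. The sketch below is illustrative only: the names are
not from the report, and leaving course 43489/90/91 out of the weighted
average is an assumption based on its cost being listed as NA in
Table 5-13.

```python
# (length in weeks, cost per student per week) for courses with a listed cost.
COURSES = [
    (8, 119), (8, 101), (6, 125), (20, 143), (4, 123),
    (3, 136), (5, 112), (10, 165), (6, 126), (5, 108),
]
PER_DIEM_PER_DAY = 31        # dollars, stays of two weeks or longer
ROUND_TRIP_TRAVEL = 400      # dollars, to Oklahoma City
ROUNDED_WEEKLY_COST = 350    # $130 course cost + $217 per diem, rounded


def weighted_weekly_course_cost() -> float:
    """Average weekly cost per student, weighted by course length."""
    total_weeks = sum(weeks for weeks, _ in COURSES)
    return sum(weeks * cost for weeks, cost in COURSES) / total_weeks


def weekly_per_diem() -> int:
    return PER_DIEM_PER_DAY * 7


def training_cost(weeks: float) -> float:
    """TC(w) = $400 + $350w for a single student."""
    return ROUND_TRIP_TRAVEL + ROUNDED_WEEKLY_COST * weeks


if __name__ == "__main__":
    print(round(weighted_weekly_course_cost()), weekly_per_diem())
    for w in (6, 20, 32):
        print(w, training_cost(w))
```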

The next task is to estimate the number of people who would require
training and the amount of training that is necessary. To do this the
relevant organizations will be examined.

Consider the en route centers. The two relevant organizations are the
Air Traffic Service (AT) and the Airway Facilities Service (AF). As a
first approximation the division of responsibility between these
organizations is that AT maintains the software in the CCC and AF maintains
all hardware as well as the software in the display channel. Typical though
not invariable staffing patterns for AT and AF at a center are shown in
Tables 5-14 and 5-15, respectively. Table 5-16 shows the AT staffing at the
FAATC. (These staffing patterns were provided by FAA personnel. Table 5-15
is an almost perfect match with a similar table in [ASI80, p. 44], which,
after subtracting out secretaries, shows 47 AF employees at each center.)

At the FAATC AF has about 25 people; roughly 15 work on the 9020A, D, and E,
and 10 work on the Raytheon 730. R&D, ARD-140, at the Technical Center has
roughly 20 people; about 12 work on near-term enhancements and 8 on
specifications for 9020 replacement. ACT-700, which operates and manages
the 9020's at the Technical Center, has about 100 relevant employees. There
are also a number of contractors at the center, but no costing of contractor
training will be attempted.

Considering the duties of each organization and what would be affected
by rehosting, estimates of the amount of needed training have been made and
are shown in Table 5-17. This table shows for each organization at the en
route centers and the Technical Center the estimated number of people who
would need training and the average length of the training. The length of
the training, when substituted into the training cost equation above, yields
the cost per student shown in Table 5-17. Multiplying this by the number
of students yields the total training cost for each organization. It is
seen that the estimated training cost is $370,400 at each center and is
$630,000 at the Technical Center.

Table 5-17 also shows the cost of training the instructors at the
Academy. The rule of thumb provided by the chief of the Automation Section
at the Academy is that the ratio of hours needed to train the instructors to
the hours of course time is 1.5 to 1. Therefore, the cost of training
the instructors is estimated to be

     3,500 x 1.5 x $16.94 = $88,935.

The final entry in Table 5-17 is the total cost of teaching the new
courses during the transition. This cost of $8.127 million is obtained by
adding the cost at each center times 20 to the costs at the Technical Center
and at the Academy.
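
The total in Table 5-17 can be rebuilt from the head counts, the course
lengths, and the training cost equation. A Python sketch of the roll-up is
given below. The names are illustrative and not from the report; the head
counts and course lengths are copied from Table 5-17.

```python
def training_cost(weeks: int) -> int:
    """TC(w) = $400 + $350w for a single student."""
    return 400 + 350 * weeks


# (organization, number requiring training, length of training in weeks)
AT_EACH_CENTER = [("AF", 46, 20), ("AT", 12, 6)]
AT_TECHNICAL_CENTER = [("AF", 20, 32), ("AT", 50, 6), ("R&D", 20, 20), ("ACT-700", 50, 6)]

NUMBER_OF_CENTERS = 20
COURSE_HOURS = 3_500
INSTRUCTOR_TRAINING_RATIO = 1.5    # instructor training hours per course hour
INSTRUCTOR_HOURLY_RATE = 16.94     # dollars per productive hour


def site_cost(groups) -> int:
    """Total training cost for all organizations at one site."""
    return sum(number * training_cost(weeks) for _, number, weeks in groups)


def instructor_training_cost() -> int:
    return round(COURSE_HOURS * INSTRUCTOR_TRAINING_RATIO * INSTRUCTOR_HOURLY_RATE)


def total_teaching_cost() -> int:
    return (NUMBER_OF_CENTERS * site_cost(AT_EACH_CENTER)
            + site_cost(AT_TECHNICAL_CENTER)
            + instructor_training_cost())


if __name__ == "__main__":
    print(site_cost(AT_EACH_CENTER), site_cost(AT_TECHNICAL_CENTER))
    print(instructor_training_cost(), total_teaching_cost())
```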

This completes the estimation of the cost of developing and teaching the
courses that would be required by rehosting. Because of doubts about
exactly what training would be required and about the accuracy of the rules
of thumb, these estimates should be thought of as approximations rather than
as precise estimates. Omitted from these estimates are the costs of
training contractors and personnel in Washington and in the regional offices.

Summary. The various transition costs are summarized in Table 5-18,
which shows that the estimated cost of remodeling the centers, paying the
extra personnel needed during the transition, and developing and teaching
the courses comes to about $36.906 million.

5.6 Program Management and Support Cost


If rehosting is adopted the FAA will incur administrative costs as it
plans, reviews, oversees the procurement, and provides general support. The


TABLE 5-14: TYPICAL AIR TRAFFIC SERVICE STAFFING AT A CENTER


 1 data systems officer
 5 operations specialists (who monitor overall operation of
   the computer from the AT viewpoint)
 4 adaptation specialists
 4 testing specialists
 7 programmers
21 Total


TABLE 5-15: TYPICAL AIRWAY FACILITIES SERVICE STAFFING AT A CENTER


 5 system performance specialists
 1 system performance officer
 3 staff engineers or technicians in depth
 7 computer operators
10 system engineers and assistant system engineers
20 technicians
46 Total


TABLE 5-16: AIR TRAFFIC SERVICE STAFFING AT THE FAA TECHNICAL CENTER

10 design team
22 production team
20 testing team
16 field support
 2 documentation
 6 supervisors
76 Total
TABLE 5-17: TRAINING COST NECESSITATED BY REHOSTING

                      Number      Length of
                      Requiring   Training    Cost per     Total
                      Training    (weeks)     Student      Cost

At each Center
  AF                     46          20       $ 7,400     $340,400
  AT                     12           6         2,500       30,000
  Total                                                   $370,400

At the Technical Center
  AF                     20          32       $11,600     $232,000
  AT                     50           6         2,500      125,000
  R & D                  20          20         7,400      148,000
  ACT-700                50           6         2,500      125,000
  Total                                                   $630,000

At the Academy                                              88,935

Total Cost of Teaching Courses                          $8,126,935
FAA has estimated that this cost would over the life of the procurement
amount to $41.2 million. Table 5-19 shows the breakdown by year. The tasks
in each year can be seen by looking at the procurement schedule in Ch. 7.

5.7 Summary

The costs that are relevant to rehosting fall into two types. The first
type is the front-end costs that are incurred initially to get the rehost
system operational at all sites. Table 5-20 summarizes the initial cost
estimates that have been made throughout this chapter; the total is $241.0
million if Amdahl 470/V7's are procured or $303.4 million if IBM 3033U's
are procured. (HQ, TC, and AC stand for FAA headquarters, the Technical
Center, and the Aeronautical Center, respectively.)

The second type of cost is the annual cost of operating and maintaining
the systems. The annual cost savings provided by rehosting are summarized
in Table 5-21. Once the three year shakedown period is completed, the
annual saving in personnel cost and parts cost is estimated to be $9.3
million.

Tha  point  about  Table  5-20  to  bo  emphasized  is  tha  dominance  of  tha 
hardware  cost.  Tha  hardware  acquisition  cost  is  about  half  of  the  total 
cost.  Moreover,  tha  next  largest  cost,  tha  cost  of  tha  initial  stock  of 
spare  parts,  is  closely  tied  to  tha  hardware  acquisition  cost.  The 
dominance  of  hardware  acquisition  means  that  efforts  to  bold  down  this  cost 
can  have  a  much  bigger  payoff  than  efforts  to  bold  down  other  costs.  Also, 
uncertainty  over  this  cost  dwarfs  all  the  other  uncertainties. 

App. F discusses the idea of saving on the hardware acquisition cost by
making a partial replacement, i.e., rehosting at some sites but keeping the
9020's at others. The conclusion is that if V7's are procured, the initial
cost of rehosting falls from $241.0 million to $107.0 million if there is
replacement at 5 centers instead of 20; if there is replacement at 10
centers, the initial cost is $151.7 million. The long-term annual saving of
$9.3 million falls to $1.6 million if there is replacement at 10 centers and
is a $0.8 million annual increase if there is replacement at 5 centers. These

TABLE 5-18: SUMMARY OF THE TRANSITION COSTS
(millions)

Remodeling 23 sites               $23.000
Extra personnel at 20 sites         4.000
Course Development                  1.779
Teaching                            8.127

Total Transition Cost             $36.906


TABLE 5-19: PROGRAM MANAGEMENT AND SUPPORT COST (millions)

TABLE 5-20: INITIAL COSTS INCURRED BY REHOSTING (millions of dollars)

                              Entries by site
                              (HQ, TC, AC, ARTCC's)             Total

Software                        5.8                               5.8
Hardware
  Engineering                   0.3                               0.3
  Acquisition
    V7                         10.8     5.4   107.6             123.8
    3033U                      15.2     7.6   152.3          or 175.1
Testing                         3.8     2.5                       6.3
Maintenance
  Initial cost
    V7                          0.9    25.8                      26.7
    3033U                       1.2    36.6                   or 37.8
Transition Cost
  Remodeling                    2.0     1.0    20.0              23.0
  Extra personnel               4.0                               4.0
  Developing courses            1.8                               1.8
  Teaching courses              0.6     0.1     7.4               8.1
Program management
  and Support                  41.2                              41.2

Total
  V7                                                            241.0
  3033U                                                      or 303.4
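
The totals in Table 5-20 can be reproduced by summing the Total column for each mainframe option. The following is a minimal sketch under that reading of the table; the dictionary keys and the function name are illustrative assumptions, not terms from the report.

```python
# Total-column entries of Table 5-20, in millions of dollars.
# Keys are illustrative labels; the values are the figures printed in the table.
COMMON_COSTS = {
    "software": 5.8,
    "hardware engineering": 0.3,
    "testing": 6.3,
    "remodeling": 23.0,
    "extra personnel": 4.0,
    "developing courses": 1.8,
    "teaching courses": 8.1,
    "program management and support": 41.2,
}
MAINFRAME_COSTS = {
    "V7": {"acquisition": 123.8, "maintenance initial cost": 26.7},
    "3033U": {"acquisition": 175.1, "maintenance initial cost": 37.8},
}


def initial_cost(mainframe):
    """Initial rehosting cost (millions) for the chosen mainframe option."""
    total = sum(COMMON_COSTS.values()) + sum(MAINFRAME_COSTS[mainframe].values())
    return round(total, 1)  # the table is stated to one decimal place


def acquisition_share(mainframe):
    """Fraction of the initial cost that is hardware acquisition."""
    return MAINFRAME_COSTS[mainframe]["acquisition"] / initial_cost(mainframe)


if __name__ == "__main__":
    for option in MAINFRAME_COSTS:
        print(option, initial_cost(option), f"{acquisition_share(option):.0%}")
```

For the V7 option the acquisition share comes out at roughly one half, which is the dominance of hardware cost emphasized in Sec. 5.7.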


TABLE 5-21: ANNUAL MAINTENANCE COST SAVING PROVIDED BY REHOSTING

Source: Table 5-12

estimates are incomplete since they do not reflect the inconvenience that
would result from there being two different systems in the field.

Another suggestion for decreasing the cost is to follow an alternate,
less finely tuned rehosting approach than that assumed in this report. This
alternate approach would keep the present peripherals and make fewer changes
in the software; this would entail a much higher VM overhead. This approach
would cut perhaps $25 million off the cost (largely because the cost of new
peripherals would be saved) and perhaps 12-15 months off the procurement
schedule (since a much less extensive system development would be needed).

6. TRANSITION


The FAA has established the requirement that when the existing en route
computers are replaced, the transition from the old to the new system must
be smooth and trouble-free so that safety is not jeopardized. A key
ingredient in meeting this requirement is having a 90 day period of parallel
operation of the two systems [FAA80, p. 17]. This ensures that there will be
a proved, reliable back-up if there are any problems with the new system.

If this somewhat intricate, non-standard transition is to be accomplished
successfully, three types of problems must be dealt with:

o remodeling problems;
o technical problems;
o personnel problems.

Each problem area will now be briefly discussed.

Remodeling problems. Because of the need for parallel operation of the
two systems, there must be enough room in the centers to accommodate both
systems simultaneously. Once the direct access radar channel is
field-tested and is operating normally, the plan is to remove the broadband
radar from the centers; this will free up about 5000 square feet [M0U.81,
pp. 39-40]. If this happens by 1984 as planned, then there will be
sufficient space for the two systems and there will be no need for major
construction. (This space, however, might be in an undesirable location
such as a basement.) Since only remodeling would be required, no
significant transition problems are expected. If, however, the broadband is
not removed on schedule, then it is possible that there could be
insufficient space, and there could be a need for major construction or for
temporary shelters.

Technical problems. The technical problems posed by parallel operation
fall into three areas. First, it is necessary to be able to feed the input
signals into either (or both) of the two systems. These inputs include both
external inputs (e.g., from radar) and internal inputs (e.g., from
controllers). While all these signals have a relatively low data
transmission rate, the interface to these signals will be at the channel
side of the PAM's and DAU's rather than at the termination of the input
circuits since the PAM's and DAU's will be reused in the rehost system. The
basic procedure for providing access to the PAM's and DAU's from the current
and the rehost systems is to extend the current capability of the line
controllers that allows them to be connected to more than one channel. After
this multi-channel interface capability has been expanded, the line
controllers would be connected to the rehost system. The procedures and
schedules for this shared access to the input signals must be carefully
planned to ensure continuous operation of the entire system and minimize the
disruption to each line controller as the shared access is implemented. The
physical placement of new cables under the computer room floor will require
careful planning as well since a large number of cables have already been
located in the work area under the floor. While the display channel using
the Raytheon 730 has a different line controller for the input circuits than
the 9020D, this interface is not expected to be more complex than that for a
DAU.


Second, if the rehost system fails, it is necessary that the 9020 system
be able to take over and prevent a serious interruption in service. This
can be done provided that the 9020 has access to a current data base. There
are several schemes that could provide this current data base. In one
leading scheme the inputs are fed into both the old and new systems, and
both systems operate continuously. This means that if the new system fails,
then the old system has a current data base since it has been maintaining
it. The switch from the new to the old system would probably be done
manually, so there would be a lag between the occurrence of a failure and
the time that this failure is detected and the switch thrown. Once the
switch is thrown, it would only be a matter of milliseconds until the old
system comes on line and is providing full service. In short, it does seem
possible to make a smooth transition from a failed system to the other
system.


Third, it is necessary to feed output signals from either the current
system or the rehost system to their appropriate destinations. These
outputs fall into two groups: outputs for the display generators and
outputs to low data transmission rate internal and external circuits. The
transition procedure for the second group of outputs would be part of the
input transition procedure since the same line controllers are used for
input and output. Shared access to the display generators would be provided
in a manner similar to that for the line controllers.

Personnel problems. The need to have personnel at each center at all
times to maintain the system(s) leads to two possible problems. First,
before replacement occurs, a large number of personnel at each center will
require training, but this training must be scheduled so that enough
personnel are left at the center to provide adequate support. Second,
during the period of parallel operation there must be sufficient personnel
to support both systems.

No plan to deal with these problems will be spelled out, but it is clear
that these problems can be dealt with. For example, having the contractor
hire and train extra personnel that would float from center to center as
needed would be one possible way of dealing with these problems. The
specific plan adopted should deal with a number of questions.

o When should the new system be installed at the Academy in Oklahoma
  City? If installed too late, it will not be available when it is
  needed for the initial surge of training.

o How will training be scheduled? If training is too early, skills
  will deteriorate before the new system is installed; if training
  occurs just before replacement, this might leave the center
  undermanned.

o How much of the training can be computer-based instruction that
  occurs at the centers?

These are all problems that, while not insuperable, do require careful
planning.


Summary. It is seen that there are a number of complex problems that
attend the transition period of parallel operation of the two systems.
Though a heedless transition would run afoul of these problems, it seems
fair to say that they can be successfully handled by advance thinking and
careful preparation.

Engineering and Development, FAA, January 9, 1981.


7. PROCUREMENT SCHEDULE


In deciding whether rehosting is a suitable method for extending the
life of the current system, two questions arise. First, when will the
current system need to be upgraded in order to avoid capacity problems?
Second, could rehosting be accomplished fast enough to provide the needed
upgrading? This chapter will not discuss the question of when and if
upgrading will be needed; that question is being examined by other studies
being conducted by the FAA. This chapter will, however, examine the
question of how quickly rehosting could be accomplished. The goal is to
estimate how much time elapses between the time the FAA issues the request
for proposals (RFP) and the time that the rehost system is operating
normally. This elapsed time is estimated by considering the six stages of
the procurement that follow the issuance of the RFP; the duration of each
stage is derived from discussions with FAA personnel.

First, potential contractors prepare proposals that spell out the
approach to rehosting that the contractor plans to follow; 3 months.

Second, the FAA evaluates the proposals and awards a contract; 6 months.

Third, the contractor who is selected develops the software and hardware
that his rehosting approach requires; 21 months.

Fourth, the contractor delivers and installs a system at the FAA
Technical Center (FAATC), and this system is then subjected to full testing;
9 months.

Fifth, the system is installed at an ARTCC and fully tested and brought
to the point where it is operational; 6 months.

Sixth, the system is installed and made operational at the remaining
ARTCC's; 24 months.

Table 7-1 summarizes the stages and the elapsed times of this prudent,
conservative procurement schedule. From the time that the RFP is issued
until the first system is fully tested and operating normally at an ARTCC,
45 months (3 years, 9 months) elapse; from the decision to rehost until
replacement is complete, 69 months (5 years, 9 months) elapse. This means
that if the RFP were issued on 1 July 1982, then the first rehost system at
an ARTCC would be operational on 1 April 1986; the rehost systems would be
operational at all sites on 1 April 1988.
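
The cumulative schedule can be traced with a few lines of code. This is a sketch for checking the dates only: the stage durations are those listed above, while the helper names (`add_months`, `milestones`) are assumptions introduced for illustration.

```python
from datetime import date

# The six procurement stages and their durations in months, as listed above.
STAGES = [
    (3, "Industry prepares proposals"),
    (6, "FAA evaluates proposals and awards contract"),
    (21, "Contractor develops software and hardware"),
    (9, "System tested at the FAATC"),
    (6, "System installed and tested at the first ARTCC"),
    (24, "Systems installed at the remaining ARTCC's"),
]


def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    Only safe when start.day exists in every month (e.g. the 1st).
    """
    index = start.year * 12 + (start.month - 1) + months
    return date(index // 12, index % 12 + 1, start.day)


def milestones(rfp_date):
    """List (stage, cumulative months, completion date) for each stage."""
    elapsed = 0
    result = []
    for months, stage in STAGES:
        elapsed += months
        result.append((stage, elapsed, add_months(rfp_date, elapsed)))
    return result


if __name__ == "__main__":
    for stage, elapsed, done in milestones(date(1982, 7, 1)):
        print(f"{elapsed:3d} months  {done.isoformat()}  {stage}")
```

With an RFP date of 1 July 1982, the fifth and sixth milestones fall on 1 April 1986 and 1 April 1988, matching the dates given in the text.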

It is possible that the need for rehosting would be seen as urgent and
that Congress would mandate that rehosting be accomplished as quickly as
possible, with speed being achieved by cutting down administrative delays.
This report will not speculate on how much the procurement schedule could be
compressed under these circumstances.

It has been suggested that the mainframes could be leased rather than
purchased to shorten the procurement cycle; this approach has three
problems. First, since developing the system rather than acquiring the
mainframes is the bottleneck in the procurement, leasing would not speed the
procurement. Second, since three years is typically the break-even point
for a lease and since these computers would probably be in place for more
than three years, leasing would end up costing more than purchasing. Third,
the user is typically not allowed to maintain the leased computers, and this
would interfere with the FAA's providing the type of maintenance required by
air traffic control.

One way to shorten the procurement schedule by perhaps 12-15 months
would be to follow the alternate rehosting approach mentioned in the last
paragraph of Ch. 5.

TABLE 7-1: THE PROCUREMENT SCHEDULE

Elapsed Time
  (months)     Stages of the Procurement

      3        Industry: prepares proposals
      6        FAA: evaluates proposals and awards contract
     21        Contractor: develops software and hardware
      9        FAA and contractor: test system at the FAATC
      6        FAA and contractor: install and test system at
               the first field site
     24        Contractor: installs systems at the remaining centers

     69        Total

8. GROWTH POTENTIAL


8.1 Introduction

In order to minimize the future trauma of transitioning to a new system,
the FAA has specified that any replacement system must be able to evolve
smoothly over the next few decades. In particular, any replacement system
must be capable of:

o Accommodating new hardware so that:

    + The capacity of the system can be increased and the response times
      for all ATC related activities can be maintained at specified
      levels,

    + New hardware technology can be integrated into the system in an
      evolutionary manner.

o Accommodating the evolution of ATC functions so that:

    + Existing capabilities can be refined and extended,

    + New capabilities can be added that automate more of the ATC
      process.

The ability of the rehosted system to meet these needs will now be discussed.

8.2 Hardware Growth Potential

The baseline rehost configuration would have two mainframe processors,
each with a processing capacity of 5,900 KOPS [LIAS80, p. 104]. This
configuration will allow for growth in processing capacity by upgrading the
processors since even an average sized mainframe processor by current
standards would have more processing capacity than the current 9020
systems. That is, a CCC has a total processor capacity of 790 KOPS and 1452
KOPS [LIAS80, p. 104] for 9020A and 9020D configurations, respectively.
Note that the usable capacity of a CCC is about 25% less than the total
capacity due to memory contention and program element queueing. Mainframe
processors with 7 to 9 times the capacity of a 9020D system [IBM80 and
AMDA80] have been announced for delivery in 1982. By the end of the
eighties, System/360 instruction-compatible mainframe processors with 25 to
50 times the capacity of a 9020D system [NISE80] should be available. By
comparison, the estimates for the processing requirements for a fully
automated ATC system are 10 to 15 times the processing capacity of a 9020D
system [CLAP79].
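
The capacity comparisons above are all expressed as multiples of a 9020D, so they reduce to simple ratios. The sketch below restates them; the function names are illustrative assumptions, and the 25% loss figure is the approximate value quoted in the text for a CCC.

```python
# Processor capacities in KOPS, as quoted in this section.
KOPS = {"9020A": 790, "9020D": 1452, "rehost processor": 5900}


def relative_to_9020d(kops):
    """Capacity expressed as a multiple of a 9020D's total capacity."""
    return kops / KOPS["9020D"]


def usable_capacity(total_kops, loss_fraction=0.25):
    """Usable CCC capacity after memory contention and queueing losses.

    The default 25% loss is the approximate figure given in the text.
    """
    return total_kops * (1 - loss_fraction)


def kops_range(low_multiple, high_multiple):
    """Convert a 'times a 9020D' range into KOPS."""
    return low_multiple * KOPS["9020D"], high_multiple * KOPS["9020D"]


if __name__ == "__main__":
    ratio = relative_to_9020d(KOPS["rehost processor"])
    print(f"One rehost processor is about {ratio:.1f} times a 9020D")
    print("Usable 9020D capacity (KOPS):", usable_capacity(KOPS["9020D"]))
    print("Fully automated ATC estimate (KOPS):", kops_range(10, 15))
```

A single baseline rehost processor works out to roughly four times a 9020D, which is the figure cited later in this section.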

The memory size of the current 9020 system is limited by signal
propagation delay problems to 3 megabytes for a 9020A and 5 megabytes for a
9020D or 9020E. Current technology memory (solid state instead of core) is
physically much smaller than the 9020 memory, so that signal
delays are no longer a problem. In addition, current technology memory is
faster, cheaper and more reliable. Up to 32 megabytes of physical memory
can be attached to candidate rehost computers. Even larger physical
memories are possible since the near-term limit is based on effective usage
rather than physical constraints.

The current size of the NAS monitor and application software is about
4.1 megabytes [FAAT81]; the size of the 9020E software is less than 1
megabyte. These programs in combination with a virtual machine monitor of
1.5 to 2 megabytes give a minimum memory requirement for the rehost
configuration of 6.6 to 7.1 megabytes. The expected memory requirements of
near term ATC enhancements are less than 8 megabytes.

Mass storage devices (disk and tape) continue to improve with respect to
capacity, response time and reliability. For example, disks with 200 to 500
megabyte capacities are currently in wide use; the capacity of a 2314 disk
used with the 9020 system is 25 megabytes. Disks now on the market have
about one half the access and latency delays of 2314 disks and about 3 times
the data transfer rate of 2314 disks. At the present time, about 15
megabytes of information is stored on the 2314 disks in support of ATC
operations [KAND77]. While the amount of disk-resident information is not
expected to increase significantly, the capacity of a single replacement
disk would allow at least 10 times more information to be disk-resident.

Note that the large physical memory of the baseline rehost configuration
would eliminate the need for disk buffering of program elements (PE's), which
accounts for about 50% of disk activity [KAND77].

Tape drives with 8 times the capacity and 9 times the transfer rate of
2401 tapes are currently in use. Since data logging (SAR, REMON and DLOG)
represents all of the tape usage, capacity or response time needs are not
expected to change from the current situation. In addition, the tape drives
would be assigned dedicated channels in the baseline rehost configuration so
that channel utilization or contention issues in the current system [KAND77]
would be minimized.

The above comments and the discussion in Sec. 3 indicate that the
baseline rehost configuration will support growth in air traffic and ATC
functions using existing hardware components that have significant
performance characteristics (4 times the processing capacity of a 9020D
system) but not maximal characteristics by current commercial standards.
Since the NAS software can be rehosted without modifying the computer
hardware, the processing capacity of the rehost computer can be upgraded.
For example, the Amdahl 470/V7 can be field-upgraded to a model V8 and
provide an increase in processor capacity that is estimated to be 7 percent
by one source [LIAS80, p. 104] and 23 percent by another [HEMK81, p. 14].
The survivability of 360 instruction-compatible processors is assured due to
the very large investment in software for this type of processor. The
impact of the transition to each upgrade option on the ATC operations would
be minimal since the 360 instruction-compatible computers and peripherals
have become de facto standards and market forces ensure that only fully
compatible devices are offered for sale.

8.3 Software Growth Potential

While the growth potential of the hardware has been described in largely
quantitative terms, the software growth potential is difficult to quantify
and will be described in qualitative terms. Software evolution can proceed
in two ways: in a gradual, incremental extension of the rehosted NAS
software or in a discontinuous replacement of the rehosted NAS software. In
either case, the overall software organization for the baseline rehost
configuration would be based on a virtual machine concept that allows many
processes to proceed concurrently and as independently as necessary. The
virtual machine concept would need an extension to allow efficient and
responsive communications between cooperating subsystems operating in
different virtual processes.

In the incremental extension case, the current NAS application software
and an adapted NAS monitor would provide the kernel for developing new ATC
processes. However, interfacing new processes with or revising and
augmenting the existing NAS application software will continue to be a
difficult task due to the strong data coupling between software modules and
the highly optimized assembly language code used for some software modules.
While rehosting the NAS application software will ensure ATC functional
continuity, this rehosting will also preserve all the maintenance costs and
problems of this software.

The second way that the software might change is through a rewrite and
replacement of all (or at least a significant portion of) the NAS code. The
idea is that after rehosting has been adopted and has taken care of the
short-run capacity problems, longer run problems can possibly be dealt with
by using modern software engineering techniques to develop new software that
will eliminate the disadvantages of the current software and take advantage
of the capabilities of the new hardware. This new software would not only
provide more reliable and maintainable software support for the current ATC
functions but would provide a more viable baseline for supporting the
evolution of ATC functions. Note that the cost for a software rewrite will be
substantial and this cost has neither been estimated for the purposes of
this report nor included in any cost calculations in Chapter 5.

A. SYSTEM AVAILABILITY AND SYSTEM MTBF: DETAILED ANALYSIS

A.1 Purpose and Organization of this Appendix

The purpose of this appendix is to show how the estimates of system
availability and system mean time between failure (MTBF) given in Sec.'s 2.2
and 2.5 are obtained. First, considering only hardware failures, Sec. A.2
and A.3 explain the principles used to estimate system availability and
MTBF, respectively. Second, Sec. A.4 extends the analysis to cover not only
hardware failures but also software failures. The principles in each
section are illustrated by calculating the availabilities and MTBF's for the
9020D/9020E system and for the rehost system under the baseline assumptions.

A.2 System Availability: Hardware

A.2.1 Principles

This subsection shows how system availability can be estimated from
information about failure rates, repair rates, and configurations.

Consider a single unit with an MTBF of $1/\lambda$. Let the mean time to
repair be $1/\mu$. Define a "cycle" to start at the moment a unit is placed
in service following a repair and to last until the next moment when, having
failed and been repaired, the unit is again placed in service. A cycle,
therefore, includes the time spent operating and the time spent being
repaired. The expected time spent operating in a cycle is $1/\lambda$, and
the expected time spent being repaired is $1/\mu$, so the expected length of
the cycle is

$$\frac{1}{\lambda} + \frac{1}{\mu} = \frac{\mu + \lambda}{\lambda\mu}.$$

The fraction of the time spent operating is

$$\frac{1/\lambda}{(\mu + \lambda)/\lambda\mu} = \frac{\mu}{\mu + \lambda}.$$

Therefore, $\mu/(\mu + \lambda)$ is this single unit's availability, i.e., the
probability that it is operating at any randomly chosen point in time.


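Consider a subsystem that contains n identical, independent units;
suppose that at least m of the units must be working to prevent a failure of
the subsystem. Letting A(n,m) stand for the availability of this subsystem
of n units, m of which must operate to prevent a failure, we have

$$A(n,m) = \sum_{i=m}^{n} C(n,i) \left(\frac{\mu}{\mu+\lambda}\right)^{i} \left(\frac{\lambda}{\mu+\lambda}\right)^{n-i}, \qquad (1)$$

where

$$C(n,i) = \frac{n!}{i!\,(n-i)!},$$

which is the number of different combinations of i objects that can be
chosen from a set of n objects. The values of the availability function
that are used below are

$$A(2,1) = 2\left(\frac{\mu}{\mu+\lambda}\right)\left(\frac{\lambda}{\mu+\lambda}\right) + \left(\frac{\mu}{\mu+\lambda}\right)^{2} = \frac{2\mu\lambda + \mu^{2}}{(\mu+\lambda)^{2}} = 1 - \left(\frac{\lambda}{\mu+\lambda}\right)^{2},$$

$$A(3,2) = 3\left(\frac{\mu}{\mu+\lambda}\right)^{2}\left(\frac{\lambda}{\mu+\lambda}\right) + \left(\frac{\mu}{\mu+\lambda}\right)^{3} = \frac{3\mu^{2}\lambda + \mu^{3}}{(\mu+\lambda)^{3}},$$

$$A(6,5) = 6\left(\frac{\mu}{\mu+\lambda}\right)^{5}\left(\frac{\lambda}{\mu+\lambda}\right) + \left(\frac{\mu}{\mu+\lambda}\right)^{6} = \frac{6\mu^{5}\lambda + \mu^{6}}{(\mu+\lambda)^{6}}.$$

These formulas allow the availability of any subsystem to be estimated once
$\lambda$ and $\mu$ are known.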
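
Eq. (1) is a binomial sum and is straightforward to evaluate numerically. The following is a minimal sketch, not the report's own computation: the function names are assumptions, and the failure rate used in the demonstration is a hypothetical value chosen only to exercise the formulas.

```python
from math import comb


def unit_availability(lam, mu):
    """Availability of one unit with failure rate lam and repair rate mu."""
    return mu / (mu + lam)


def availability(n, m, lam, mu):
    """A(n, m) from Eq. (1): at least m of n independent units working."""
    a = unit_availability(lam, mu)
    return sum(comb(n, i) * a**i * (1 - a)**(n - i) for i in range(m, n + 1))


def closed_form_2_1(lam, mu):
    """Closed form for A(2,1) given in the text."""
    return 1 - (lam / (mu + lam))**2


def closed_form_3_2(lam, mu):
    """Closed form for A(3,2) given in the text."""
    return (3 * mu**2 * lam + mu**3) / (mu + lam)**3


if __name__ == "__main__":
    lam = 1 / 1000   # hypothetical failure rate (per hour), for illustration only
    mu = 1.0         # repair rate for a 1-hour mean time to repair
    print("A(2,1):", availability(2, 1, lam, mu), closed_form_2_1(lam, mu))
    print("A(3,2):", availability(3, 2, lam, mu), closed_form_3_2(lam, mu))
```

The general sum and the closed forms agree to floating-point precision, which is a useful check on the expansions listed above.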


It will now be explained how the availability of a system is built up
from the availability of its subsystems. If a system is composed of a
number of independent subsystems such that a failure of any subsystem causes
a system failure, then the system availability $A_s$ is the product of the
subsystem availabilities. For example, if there are two independent
subsystems with availabilities $A_1$ and $A_2$, then

$$A_s = A_1 A_2. \qquad (2)$$

If a system is composed of two independent subsystems and if a system
failure occurs only if both subsystems fail, then the system availability is
the probability that at least one of the subsystems works, i.e.,

$$A_s = 1 - (1 - A_1)(1 - A_2) = A_1 + A_2 - A_1 A_2. \qquad (3)$$

Equations (2) and (3) are only valid if no two subsystems are both
failed at the same time. While this might happen, in the problem being
considered it has such a small probability (on the order of about
$10^{-12}$) that it can be ignored without damaging the results.
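
Eqs. (2) and (3) correspond to the usual series and parallel combinations of independent subsystems. A brief sketch follows; the function names and the sample availabilities are hypothetical and serve only to show the two rules side by side.

```python
from math import prod


def series(availabilities):
    """Eq. (2): the system fails if any subsystem fails."""
    return prod(availabilities)


def parallel(a1, a2):
    """Eq. (3): the system fails only if both subsystems fail."""
    return 1 - (1 - a1) * (1 - a2)


if __name__ == "__main__":
    a1, a2 = 0.9, 0.8  # hypothetical subsystem availabilities
    print("series:  ", series([a1, a2]))
    print("parallel:", parallel(a1, a2))
```

Redundancy (the parallel rule) always yields an availability at least as high as either subsystem alone, while the series rule never exceeds the weakest subsystem.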

A.2.2 Results

9020D/9020E system. The availability of a system with a 9020D in the CCC
and a 9020E in the display channel will now be estimated using the baseline
component MTBF's from Table 2-1 and a mean time to repair (MTTR) of one hour.

Table A-1 shows the details of the calculation of the availability of a
9020D system. It shows for each subsystem the number of units n it has, the
number m that must be working to prevent a system failure, the failure rate
$\lambda$ (from Table 2-1), and the relevant availability formula from
Sec. A.2, where it is assumed that the mean time to repair $1/\mu$ is 1 hour.
The last column in the table shows the subsystem availability, which is
obtained by substituting the $\lambda$ given in the table into the formula.
Since the failure of any subsystem causes a system failure of the 9020D, the
availability of the 9020D is 0.99999151, which is the product of the
subsystem availabilities.

TABLE A-1: AVAILABILITY OF A 9020D SYSTEM

Component     n    m    $\lambda$      Formula                            Availability

CE            3    2    [illegible]    $(3\lambda + 1)/(1 + \lambda)^3$   0.99999943
SE            6    5    [illegible]    $(6\lambda + 1)/(1 + \lambda)^6$   0.99999851
IOCE          3    2    [illegible]    $(3\lambda + 1)/(1 + \lambda)^3$   0.99999970
TCU           3    2    [illegible]    $(3\lambda + 1)/(1 + \lambda)^3$   0.99999996
SCU           3    2    [illegible]    $(3\lambda + 1)/(1 + \lambda)^3$   0.99999390

9020D System                                                              0.99999151

Since the 9020D/9020E system fails if the 9020D or the 9020E fails, the
availability of the 9020D/9020E system is the availability of the 9020D
times that of the 9020E. In this report it is assumed that the 9020D and
9020E are equivalent from the point of view of reliability. Therefore, the
availability of a system with a 9020D in the CCC and a 9020E in the display
channel is 0.99998301, which is the square of 0.99999151.
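
The product rule of Eq. (2) can be applied directly to the Availability column of Table A-1. The sketch below does so; the names are illustrative assumptions, and because the table entries are rounded to eight decimal places the recomputed values should be expected to match the report's figures only to about that precision.

```python
from math import prod

# Subsystem availabilities from the last column of Table A-1.
TABLE_A1 = {
    "CE": 0.99999943,
    "SE": 0.99999851,
    "IOCE": 0.99999970,
    "TCU": 0.99999996,
    "SCU": 0.99999390,
}


def series_availability(availabilities):
    """Eq. (2): product of independent subsystem availabilities."""
    return prod(availabilities)


# 9020D alone: every subsystem must be available.
a_9020d = series_availability(TABLE_A1.values())

# 9020D/9020E system: the two are assumed equivalent, so square the
# 9020D figure reported in the text.
a_9020d_9020e = series_availability([0.99999151, 0.99999151])

if __name__ == "__main__":
    print(f"9020D:       {a_9020d:.8f}")
    print(f"9020D/9020E: {a_9020d_9020e:.8f}")
```

Both results agree with the figures quoted above to within about one unit in the eighth decimal place, which is consistent with rounding of the tabulated inputs.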

Rehost  system.  The  availability  of  the  rehost  system  will  now  be 
estimated.  Define  a  mainframe  to  be  a  CPU,  memory,  and  six  pairs  of 
channels.  Since  the  two  mainframes  run  In  parallel.  It  Is  first  necessary 
to  find  the  availability  of  a  single  mainframe. 

Table  A-2  shows  the  details  of  the  calculations  of  the  availability  of  a 
single  mainframe.  This  table  shows  that  for  a  single  mainframe  to  be 
available,  the  single  CPU  must  be  working,  the  single  memory  must  be 
working,  and  at  least  one  channel  In  each  of  the  six  pairs  must  be  working. 
This  table  also  shows  the  failure  rates  (from  Table  2>1)  and  the  formulas 
used  to  calculate  availability  (from  Sec.  A. 2).  The  last  column  shows  the 
availability  of  each  subsystem,  and  by  multiplying  these  three 
availabilities  together  the  availability  of  a  single  mainframe  Is  found  to 
be  0.99751951. 

Since a system failure occurs only if both mainframes fail, we are
interested in the probability that at least one mainframe is working, which
from Eq. (3) is

    1 - (1-A)(1-A) = 2A - A^2
                   = 2(0.99751951) - (0.99751951)^2
                   = 0.99999385.
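The series and parallel rules used above can be checked numerically. The following Python sketch is illustrative only: the component availabilities are the rounded figures quoted elsewhere in this appendix from Table A-2, and the helper names (series, one_of_two) are hypothetical. Because the inputs are rounded, the results can be expected to agree with the text only to about seven decimal places.

```python
# Component availabilities as quoted in this appendix (rounded values).
CPU_AVAILABILITY = 0.99926308
MEMORY_AVAILABILITY = 0.99825860
CHANNEL_PAIR_AVAILABILITY = 0.99999942  # at least one channel of a pair works
CHANNEL_PAIRS_PER_MAINFRAME = 6


def series(*availabilities):
    """Availability of subsystems that must all be working at once."""
    product = 1.0
    for a in availabilities:
        product *= a
    return product


def one_of_two(a):
    """Availability of two identical parallel units when one suffices."""
    return 1.0 - (1.0 - a) ** 2  # algebraically equal to 2a - a^2


mainframe = series(
    CPU_AVAILABILITY,
    MEMORY_AVAILABILITY,
    CHANNEL_PAIR_AVAILABILITY ** CHANNEL_PAIRS_PER_MAINFRAME,
)
both_mainframes = one_of_two(mainframe)

print(f"single mainframe:               {mainframe:.8f}")
print(f"at least one of two mainframes: {both_mainframes:.8f}")
```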

Table A-3 shows the details of the calculations of the availability of
the rehost system. At least 1 of the 2 mainframes, 1 of the 2 TCU's, and 1
of the 2 SCU's must be working to prevent a system failure. The failure
rates and availability formulas are given for the TCU's and SCU's (but not
for the mainframe since the special calculation above replaces the
formula). The last column shows the availability of each subsystem, and the
availability of the rehost system is 0.99999283, the product of these
availabilities.
A.3 System MTBF: Hardware


A.3.1 Principles

This subsection shows how system MTBF can be estimated from information
that is known.

The definition of availability is

    A_s = MTBF_s / (MTBF_s + MTTR_s),

where the subscript s is used to show we are talking about the system. This
can be rearranged to

    MTBF_s = [A_s / (1 - A_s)] x MTTR_s.                              (4)

Eq. (4) expresses the system MTBF as a function of system availability and
the system MTTR. The values for system availability are known; they are
derived in Sec. A.2. Therefore, once the system MTTR is known, system MTBF
can be estimated. Note that system MTTR is the same thing as the expected
duration of a system failure. The method used to estimate the system MTTR
will now be explained.


One might think that because the MTTR for each unit is 1 hour, the MTTR
for the system would also be 1 hour, but this is incorrect. To see this,
suppose that one unit fails; this does not cause a system failure because of
redundancy. Then suppose that a second unit fails; this does cause a system
failure. Since the repair times are distributed exponentially, and hence are
memoryless, we can let t = 0 be the time of the second failure. If the
system failure were caused by one unit failing and if the failure ended when
that unit was repaired, then the system MTTR would be 1 hour. However, since
2 units have failed and since the system failure ends when either unit is
repaired, the expected duration of the system failure is 1/2 hour. That is,
if t_1 is the time that the first unit is repaired, and t_2 is the time that
the second is repaired, then the duration of the failure is min(t_1, t_2),
and the expectation of min(t_1, t_2) is 1/2. This is an implication of the
following theorem.

Theorem: If t_1, ..., t_n are independent, exponential random variables
with means 1/μ_1, ..., 1/μ_n, then the mean of min(t_1, ..., t_n) is
1/(μ_1 + ... + μ_n).

Proof:

Since the n random variables t_1, ..., t_n are independent, for any t

    Pr[min(t_1, ..., t_n) > t] = Pr[t_1 > t] x ... x Pr[t_n > t]
                               = e^{-μ_1 t} x ... x e^{-μ_n t}
                               = e^{-(μ_1 + ... + μ_n) t}.

The last equality implies that the distribution function of
min(t_1, ..., t_n), i.e., Pr[min(t_1, ..., t_n) < t], is
1 - e^{-(μ_1 + ... + μ_n) t}, which is the distribution function of an
exponential distribution with mean 1/(μ_1 + ... + μ_n). This completes the
proof.
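The theorem can also be checked by simulation. The sketch below is not part of the original analysis: the function names are hypothetical, every unit is given a repair rate of 1 per hour (a unit MTTR of 1 hour) as an illustrative assumption, and the sampling relies on Python's standard `random.expovariate`.

```python
import random


def mean_time_to_first_repair(repair_rates):
    """Theorem: the minimum of independent exponentials has mean 1/sum(rates)."""
    return 1.0 / sum(repair_rates)


def simulate_mean_min(repair_rates, trials=100_000, seed=1):
    """Monte Carlo estimate of E[min(t_1, ..., t_n)] for comparison."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += min(rng.expovariate(rate) for rate in repair_rates)
    return total / trials


for failed_units in (2, 3, 4):
    rates = [1.0] * failed_units  # assumed: each unit is repaired at rate 1 per hour
    exact = mean_time_to_first_repair(rates)
    estimate = simulate_mean_min(rates)
    print(f"{failed_units} failed units: exact {exact:.4f} hr, simulated {estimate:.4f} hr")
```

With two, three, and four failed units the exact values are 1/2, 1/3, and 1/4 hour; the simulated values should fall close to these.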

In the baseline case we assume that the unit MTTR is 1 hour, i.e., μ = 1.
This theorem then implies that:

• the system MTTR is 1/2 hour if the system failure is caused by the
  failure of two units (e.g., by two 9020D CE's or by two rehost TCU's);

• the system MTTR is 1/3 hour if the system failure is caused by the
  failure of three units (e.g., by a pair of channels failing on one
  rehost mainframe and the CPU failing in the other rehost mainframe);

• the system MTTR is 1/4 hour if the system failure is caused by the
  failure of four units (e.g., by a pair of channels failing on each
  rehost mainframe).

The MTTR for each system is calculated in the next subsection.

A.3.2 Results

9020D/9020E system. The MTBF of a system with a 9020D in the CCC and a
9020E in the display channel will now be estimated. Subsec. A.2.2 estimates
the availability of the 9020D/9020E system to be 0.99998301. Since this
system fails if and only if two like units fail, the theorem of Subsec.
A.3.1 implies that the system MTTR is 1/2. Eq. (4) can now be used and the
system MTBF is

    MTBF_s = [A_s / (1 - A_s)] x MTTR_s
           = [0.99998301 / (1 - 0.99998301)] x 1/2
           = 29,429 hours
           = 1226 days.

Therefore, the estimate of the MTBF of the system with a 9020D in the
CCC and a 9020E in the display channel is 1226 days. Each system outage is
expected to last a half hour.
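As a worked check of eq. (4), the short Python sketch below recomputes the MTBF from the availability and MTTR quoted above. The function name `mtbf_hours` is hypothetical; the inputs are the figures from the text.

```python
HOURS_PER_DAY = 24


def mtbf_hours(availability, mttr_hours):
    """Eq. (4): MTBF_s = A_s / (1 - A_s) * MTTR_s."""
    return availability / (1.0 - availability) * mttr_hours


hours = mtbf_hours(0.99998301, 0.5)
print(f"{hours:,.0f} hours, or about {hours / HOURS_PER_DAY:,.0f} days")
```

Note how sensitive the result is to the last digits of the availability: the denominator 1 - A_s is of order 10^-5, so a change in the eighth decimal place of A_s moves the MTBF by tens of hours.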

Rehost system. The MTBF of the rehost system will now be estimated.
The availability of the rehost system is estimated in Subsec. A.2.2 to be
0.99999283. It is claimed that the system MTTR does not differ
significantly from 1/2; this claim is substantiated below. The system MTBF
can now be estimated using eq. (4).

    MTBF_s = [A_s / (1 - A_s)] x MTTR_s
           = [0.99999283 / (1 - 0.99999283)] x 1/2
           = 69,735 hours
           = 2905 days.

Therefore, the MTBF of the rehost system is estimated to be 2905 days. Each
system outage is expected to last a half hour.

All that remains is to substantiate the claim that the MTTR for the
rehost system does not differ significantly from 1/2. The rehost system
MTTR is not exactly 1/2 because, unlike the 9020D/E system, not all system
failures are caused by two units failing. There are three cases. First, a
system failure might result from the failure of two units, i.e., both SCU's
fail, both TCU's fail, both CPU's fail, both memories fail, or a CPU fails
on one mainframe and a memory fails on the other. In this case the system
MTTR is 1/2 hour. Second, a system failure might result from the failure of
three units, i.e., a pair of channels fails on one mainframe and the CPU or
memory fails on the other. In this case the system MTTR is 1/3 hour. Third,
a system failure might result from the failure of four units, i.e., a pair
of channels fails on each mainframe. In this case the system MTTR is 1/4
hour.

The system MTTR, therefore, is

    MTTR_s = (1/2)p_2 + (1/3)p_3 + (1/4)p_4,                          (5)

where p_i is the probability that the system failure is caused by the
failure of i units, given that a system failure occurs. To show that the
rehost system MTTR does not differ significantly from 1/2, the weights p_2,
p_3, and p_4 will now be found. Table A-4 shows the relevant information.
The first column shows the ways that a system failure can occur. For
example, "CPU x CPU" means that the CPU's fail in both mainframes. Since
this can only happen in one way, a 1 is written in the second column.
"CPU x Mem." means that the CPU fails in one mainframe and the memory in
the other; this can happen in two ways since the CPU can fail in either
mainframe. "CPU x Ch." means that the CPU fails in one mainframe and a pair
of channels fails in the other. Since each mainframe has six pairs of
channels, there are 12 different ways that "CPU x Ch." can occur.
TABLE A-4: PROBABILITIES OF DIFFERENT TYPES OF SYSTEM FAILURES
           IN THE REHOST SYSTEM

                         Number of       Unconditional    Conditional
                         Combinations    Probability      Probability

Two unit failures
  CPU x CPU                   1          5.431x10^-7        0.0757
  CPU x Mem.                  2          2.567x10^-6        0.3576
  Mem. x Mem.                 1          3.032x10^-6        0.4223
  TCU                         1          1.000x10^-6        0.1393
  SCU                         1          2.000x10^-8        0.0028
  Subtotal                                                  0.9977

Three unit failures
  CPU x Ch.                  12          5.129x10^-9        0.0007
  Mem. x Ch.                 12          1.212x10^-8        0.0017
  Subtotal                                                  0.0024

Four unit failures
  Ch. x Ch.                  36          1.211x10^-11       0.0000

Total                                    7.179x10^-6        1.0001

The rest of the failures shown in Table A-4 have similar explanations except
for the TCU and SCU; TCU, for example, means that both tape units fail.

The availability for a single CPU is 0.99926308, for a single memory is
0.99825860, and for a single pair of channels is 0.99999942; these figures
are from Table A-2 (where the channel availability is obtained by taking the
sixth root of the number shown). To illustrate the method of calculating
the unconditional probability of any particular type of mainframe failure,
consider the CPU x Mem. failure. Since the event of a CPU failing is
independent of the event of a memory failing, and since there are two
different ways a CPU x Mem. failure can occur, the probability of a CPU x
Mem. failure is

    2(1 - 0.99926308)(1 - 0.99825860) = 2.567x10^-6.

The interpretation of this probability is that if a point in time is chosen
at random, then the probability of a CPU x Mem. failure obtaining at that
point is 2.567x10^-6. The other unconditional probabilities are similarly
calculated and entered in Table A-4. The exception is that the TCU and SCU
unconditional probabilities are merely one minus the availabilities in Table
A-3. The sum of the probabilities is 7.179x10^-6, which is the probability
of the system being down. (The availability of the rehost system is then

    1 - 7.179x10^-6 = 0.99999282,

which checks with the number in Table A-3 derived by a different method.)

Dividing each unconditional probability in Table A-4 by 7.179x10^-6
gives the conditional probabilities shown in the last column. For example,
given that there is a system failure, the probability that this is a CPU x
CPU failure is 0.0757.

The subtotals in Table A-4 give the values for the weights p_2, p_3,
and p_4. Eq. (5) now becomes

    MTTR_s = (1/2)p_2 + (1/3)p_3 + (1/4)p_4
           = (1/2)(0.9977) + (1/3)(0.0024) + (1/4)(0)
           = 0.4998.

It is seen that the system MTTR, rounded to three significant figures,
equals 1/2. This completes the argument that the rehost system MTTR does
not differ significantly from 1/2.
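The construction of Table A-4 and the weighted average of eq. (5) can be reproduced with a few lines of Python. This is a sketch under stated assumptions, not the report's own procedure: the variable names are hypothetical, the unit availabilities are the rounded values quoted above, and the TCU and SCU entries are taken directly from the table. Small differences in the last digit relative to the table are to be expected from rounding.

```python
# Unavailability (1 - availability) of each rehost unit, from the quoted figures.
Q_CPU = 1 - 0.99926308
Q_MEM = 1 - 0.99825860
Q_CH = 1 - 0.99999942      # one pair of channels
Q_TCU_PAIR = 1.000e-6      # both TCU's down (table entry)
Q_SCU_PAIR = 2.000e-8      # both SCU's down (table entry)

# failure type -> (units awaiting repair, number of combinations,
#                  probability of a single combination)
FAILURE_TYPES = {
    "CPU x CPU": (2, 1, Q_CPU * Q_CPU),
    "CPU x Mem.": (2, 2, Q_CPU * Q_MEM),
    "Mem. x Mem.": (2, 1, Q_MEM * Q_MEM),
    "TCU": (2, 1, Q_TCU_PAIR),
    "SCU": (2, 1, Q_SCU_PAIR),
    "CPU x Ch.": (3, 12, Q_CPU * Q_CH),
    "Mem. x Ch.": (3, 12, Q_MEM * Q_CH),
    "Ch. x Ch.": (4, 36, Q_CH * Q_CH),
}

unconditional = {name: ways * p for name, (_, ways, p) in FAILURE_TYPES.items()}
total_down = sum(unconditional.values())

# p_i: probability that i units are down, given that the system is down.
weights = {2: 0.0, 3: 0.0, 4: 0.0}
for name, (units, _, _) in FAILURE_TYPES.items():
    weights[units] += unconditional[name] / total_down

# Eq. (5): with a unit repair rate of 1 per hour, i failed units give 1/i hour.
system_mttr = sum(p / units for units, p in weights.items())

for name, prob in unconditional.items():
    print(f"{name:12s} {prob:.3e} {prob / total_down:.4f}")
print(f"total {total_down:.3e}, system MTTR {system_mttr:.4f} hr")
```

The computed system MTTR comes out within a fraction of a percent of 1/2 hour, which is the point of the argument above.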

A.4 System Availability and System MTBF: Hardware and Software

Sec. 2.5 contains estimates of the system availability and MTBF that
take into account not only hardware failures but also software failures.
This section shows how these estimates are obtained. That is, this section
shows how the methods of Sec.'s A.2 and A.3, which only dealt with hardware,
can be extended to cover the case of both hardware and software. Again the
calculations will be illustrated using the baseline assumptions. The
9020D/9020E system and the rehost system are considered separately.

9020D/9020E system. The analysis proceeds in five steps. First,
estimate the mean duration of a software outage. It is assumed in Sec. 2.5
that, given that there is a software failure, the system outage is 0.5
minute with probability 0.9 and is 15 minutes with probability 0.1. The
mean outage due to a software failure is then

    (0.5 x 0.9) + (15 x 0.1) = 1.95 min.

Second, estimate the software availability. Sec. 2.5 assumes that the
NAS software has the same MTBF as the 9020D/9020E hardware; this MTBF is
1226 days in the baseline case. Since 1226 days contains 1,765,440 minutes,
the software availability is

    1,765,440 / (1,765,440 + 1.95) = 0.99999890.

Third, estimate system availability. Since the system works only if
both the hardware and the software work and since the hardware availability
is 0.99998301, eq. (2) implies that the system availability considering both
hardware and software is

    0.99998301 x 0.99999890 = 0.99998191.

Fourth, estimate the system MTTR, i.e., the mean duration of a system
outage. Since it is assumed that the number of system failures caused by
hardware equals that caused by software, since the mean duration of a
hardware outage is a half hour, and since the mean duration of a software
outage is 1.95 minutes, the mean duration of a system outage is

    MTTR_s = (1/2 x 1/2 hr) + (1/2 x 1.95 min x 1 hr/60 min)
           = 0.26625 hr.

Fifth, estimate system MTBF using eq. (4).

    MTBF_s = [A_s / (1 - A_s)] x MTTR_s
           = [0.99998191 / (1 - 0.99998191)] x 0.26625
           = 14,718 hr
           = 613 days.
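The five steps above can be followed in code. The sketch below uses hypothetical variable names and carries full floating-point precision between steps, whereas the text rounds each intermediate result to eight decimal places; the final MTBF therefore differs from the text's 14,718 hours by a few hours.

```python
MINUTES_PER_DAY = 24 * 60

# Step 1: mean duration of a software outage, in minutes.
software_outage_min = 0.5 * 0.9 + 15 * 0.1

# Step 2: software availability, with the software MTBF set equal to the
# hardware MTBF of 1226 days (the baseline assumption in the text).
software_mtbf_min = 1226 * MINUTES_PER_DAY
software_availability = software_mtbf_min / (software_mtbf_min + software_outage_min)

# Step 3: hardware and software must both work.
hardware_availability = 0.99998301
system_availability = hardware_availability * software_availability

# Step 4: half of the outages are hardware (1/2 hr), half are software.
hardware_outage_hr = 0.5
system_mttr_hr = 0.5 * hardware_outage_hr + 0.5 * software_outage_min / 60

# Step 5: eq. (4).
system_mtbf_hr = system_availability / (1 - system_availability) * system_mttr_hr

print(f"software outage      {software_outage_min:.2f} min")
print(f"software availability {software_availability:.8f}")
print(f"system availability   {system_availability:.8f}")
print(f"system MTTR           {system_mttr_hr:.5f} hr")
print(f"system MTBF           {system_mtbf_hr:,.0f} hr")
```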

Rehost system. The analysis of Sec.'s A.2 and A.3 must be changed in
two ways: to include failures in the virtual machine monitor (VMM) and in
the NAS software. The VMM will be treated as another component of a
mainframe just like a CPU or a memory; this is because if the VMM in one
mainframe fails, then processing is switched to the other mainframe, and the
first mainframe is out while being restarted. The analysis proceeds in six
steps.

First, estimate the availability of the VMM. With a failure rate of
twice per month, the VMM runs on average for 15 days (360 hours) and is then
down for 1/6 hour. The VMM availability then is

    360 / (360 + 1/6) = 0.99953725.

Second, estimate the availability of a single mainframe (including the
VMM). Since the CPU, memory, channels, and VMM must all be working if the
mainframe is to work, the mainframe availability is the availability of the
first three (which from Table A-2 is 0.99751951) times the availability of
the VMM, or

    0.99751951 x 0.99953725 = 0.99705791.

Third, estimate the availability of the two mainframes, i.e., the
probability that at least one mainframe is working. Eq. (3) implies

    2(0.99705791) - (0.99705791)^2 = 0.99999134.

Fourth, estimate the availability of the rehost system. It is the
product of the availabilities of the mainframes, the TCU's, the SCU's, and
the NAS software. We have

    0.99999134 x 0.99999900 x 0.99999998 x 0.99999890 = 0.99998922.
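The first four steps can be sketched in Python as follows. The variable names are hypothetical, and the subsystem availabilities are the rounded values quoted in the text, so agreement is expected to about eight decimal places rather than exactly.

```python
# Step 1: the VMM is up for 360 hours on average, then down for 1/6 hour.
vmm_availability = 360 / (360 + 1 / 6)

# Step 2: CPU, memory, and channels (Table A-2 figure) in series with the VMM.
mainframe_hardware_availability = 0.99751951
mainframe_availability = mainframe_hardware_availability * vmm_availability

# Step 3: at least one of the two mainframes is working (2A - A^2).
two_mainframes_availability = 1 - (1 - mainframe_availability) ** 2

# Step 4: mainframes, TCU's, SCU's, and NAS software must all be available.
TCU_AVAILABILITY = 0.99999900
SCU_AVAILABILITY = 0.99999998
NAS_SOFTWARE_AVAILABILITY = 0.99999890
rehost_availability = (
    two_mainframes_availability
    * TCU_AVAILABILITY
    * SCU_AVAILABILITY
    * NAS_SOFTWARE_AVAILABILITY
)

print(f"VMM            {vmm_availability:.8f}")
print(f"one mainframe  {mainframe_availability:.8f}")
print(f"two mainframes {two_mainframes_availability:.8f}")
print(f"rehost system  {rehost_availability:.8f}")
```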

Fifth, estimate the mean duration of a system outage. Table A-5, which
is analogous to Table A-4, shows the different types of system failure, each
one's mean time to repair (which assumes that the repair times are
exponentially distributed and uses the theorem in A.3.1), and the
probability of each type of failure given that there is a system failure.
These conditional probabilities are used as weights in taking a weighted
average of the mean repair times, and the resulting system MTTR is

    (1/2 x 0.6628) + (1/3 x 0.0016) + (1/4 x 0) + (1/7 x 0.2123)
    + (1/8 x 0.0018) + (1/12 x 0.0198) + (1.95/60 x 0.1018) = 0.3674 hr.

Sixth, estimate the system MTBF using eq. (4).

    MTBF_s = [A_s / (1 - A_s)] x MTTR_s
           = [0.99998922 / (1 - 0.99998922)] x 0.3674
           = 34,081 hours
           = 1420 days.
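A sketch of the fifth and sixth steps follows. The names are hypothetical. The repair rates in the helper reflect one reading of where the fractions 1/7, 1/8, and 1/12 come from (a hardware unit repaired at rate 1 per hour, a VMM restarted at rate 6 per hour, combined through the theorem of A.3.1); that reading is an assumption, although it reproduces the fractions used in the text. The code does not round the MTTR before applying eq. (4), so the MTBF differs slightly from the text's 34,081 hours.

```python
UNIT_REPAIR_RATE = 1.0  # hardware unit MTTR of 1 hour
VMM_REPAIR_RATE = 6.0   # VMM down for 1/6 hour per failure


def mean_outage_hr(*repair_rates):
    """Mean time until the first of several exponential repairs completes."""
    return 1.0 / sum(repair_rates)


H, V = UNIT_REPAIR_RATE, VMM_REPAIR_RATE

# (mean outage in hours, conditional probability) for each failure class,
# with the probabilities copied from the weighted sum in the text.
OUTAGE_CLASSES = [
    (mean_outage_hr(H, H), 0.6628),        # two hardware units
    (mean_outage_hr(H, H, H), 0.0016),     # three hardware units
    (mean_outage_hr(H, H, H, H), 0.0000),  # four hardware units
    (mean_outage_hr(H, V), 0.2123),        # CPU or memory with VMM
    (mean_outage_hr(V, H, H), 0.0018),     # VMM with a channel pair
    (mean_outage_hr(V, V), 0.0198),        # both VMM's
    (1.95 / 60, 0.1018),                   # NAS software
]

system_mttr_hr = sum(outage * prob for outage, prob in OUTAGE_CLASSES)

system_availability = 0.99998922
system_mtbf_hr = system_availability / (1 - system_availability) * system_mttr_hr

print(f"system MTTR {system_mttr_hr:.4f} hr")
print(f"system MTBF {system_mtbf_hr:,.0f} hr ({system_mtbf_hr / 24:,.0f} days)")
```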
TABLE A-5: PROBABILITIES OF DIFFERENT TYPES OF SYSTEM FAILURES
           IN THE REHOST SYSTEM

                             Number of       Unconditional    Conditional
MTTR       Type of Failure   Combinations    Probability      Probability

1/2        CPU x CPU              1          5.431x10^-7        0.0503
1/2        CPU x Mem.             2          2.567x10^-6        0.2375
1/2        Mem. x Mem.            1          3.032x10^-6        0.2806
1/2        TCU                    1          1.000x10^-6        0.0925
1/2        SCU                    1          2.000x10^-8        0.0019
           Subtotal                                             0.6628

1/3        CPU x Ch.             12          5.129x10^-9        0.0005
1/3        Mem. x Ch.            12          1.212x10^-8        0.0011
           Subtotal                                             0.0016

1/4        Ch. x Ch.             36          1.211x10^-11       0.0000

1/7        CPU x VMM              2          6.820x10^-7        0.0631
1/7        Mem. x VMM             2          1.612x10^-6        0.1492
           Subtotal                                             0.2123

1/8        VMM x Ch.             12          1.932x10^-8        0.0018

1/12       VMM x VMM              1          2.141x10^-7        0.0198

1.95/60    NAS Software           1          1.100x10^-6        0.1018

           Total                             1.081x10^-5        1.0001

B. SYSTEM MTBF WITHOUT REPAIRS: DETAILED ANALYSIS


B.1 Purpose and Organization of this Appendix

The purpose of this appendix is to show how the estimates in Sec. 2.3 of
system MTBF are obtained under the assumption that no repairs are made.
Sec. B.2 derives the equations for the system with a 9020D in the CCC and a
9020E in the display channel. Sec. B.3 then derives the equations for the
rehost system. Sec. B.4 describes the approximation used to obtain the
system MTBF. This appendix only considers hardware failures.

Throughout this appendix reliability is defined to mean the probability
that there has not been a failure after a stated period of operation.

B.2 Reliability of the 9020D or 9020E Configuration

The 9020D system is working if all of the following five conditions hold:

• at least 2 of the 3 CE's are working;

• at least 5 of the 6 SE's are working;

• at least 2 of the 3 IOCE's are working;

• at least 2 of the 3 TCU's are working;

• at least 2 of the 3 SCU's are working.

It is assumed that the 9020E has the same reliability as the 9020D since
these two systems have the same configuration (except that in the 9020E some
of the SE's are replaced by display elements).

The function r(t) is used to denote reliability for a single component,
e.g., individual CE's, SE's, IOCE's, TCU's, and SCU's. The function R(t)
is used to denote reliability for a subsystem or system. Subscripts are
used to distinguish different reliability functions. For example, r_CE(t)
is the probability that a single, specified CE has not failed after t hours
of operation; R_CE(t) is the probability that at most 1 of the 3 CE's have
failed.
Exponential failure rates are assumed for each component. Therefore,
for any component

    r(t) = e^{-λt}                                                    (1)

where λ is the failure rate.

For the subsystems containing three components with one redundant (i.e.,
the CE, IOCE, TCU, or SCU), that subsystem will function if all three
components function or if any two components function. Mathematically,

    R_CE   = r^3 + 3r^2(1-r),                                         (2a)

    R_IOCE = r^3 + 3r^2(1-r),                                         (2b)

    R_TCU  = r^3 + 3r^2(1-r), and                                     (2c)

    R_SCU  = r^3 + 3r^2(1-r).                                         (2d)

For clarity, the subscripts have been dropped on the right hand side and
the t's have been dropped on both sides. Eq. (2a) can be explained in the
following way. R_CE is the probability that not more than one CE will
fail. This probability is calculated by substituting into (2a) the value of
r obtained by substituting into eq. (1) the relevant failure rate, i.e.,
λ_CE.

For the storage elements where we have a total of six elements, the
appropriate expression is

    R_SE = r^6 + 6r^5(1-r).                                           (3)

The overall system reliability for the 9020D system is found by
combining equations (2) and (3) to obtain

    R_9020D = [r_SE^6 + 6r_SE^5(1-r_SE)] [r_CE^3 + 3r_CE^2(1-r_CE)]
              [r_TCU^3 + 3r_TCU^2(1-r_TCU)] [r_SCU^3 + 3r_SCU^2(1-r_SCU)]
              [r_IOCE^3 + 3r_IOCE^2(1-r_IOCE)].                       (4)

That is, the system reliability is the product of the probabilities that no
more than one component failure occurs in any of the subsystems.

The reliability of a system with a 9020D in the CCC and a 9020E in the
display channel is the product of the reliability of the 9020D and the
reliability of the 9020E, i.e.,

    R_9020D/9020E(t) = [R_9020D(t)]^2.                                (5)
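The reliability functions of this section translate directly into code. The sketch below is illustrative: the failure rates are HYPOTHETICAL placeholders (the report's values are in Table 2-1, which is not reproduced here), and the function names are not from the report. The helper `k_of_n_reliability` is the general binomial form; for 2-of-3 it reduces to r^3 + 3r^2(1-r) and for 5-of-6 to r^6 + 6r^5(1-r).

```python
import math


def unit_reliability(failure_rate, t):
    """Eq. (1): r(t) = exp(-lambda * t) for a single component."""
    return math.exp(-failure_rate * t)


def k_of_n_reliability(r, n, k):
    """Probability that at least k of n identical, independent units survive."""
    return sum(math.comb(n, j) * r**j * (1 - r) ** (n - j) for j in range(k, n + 1))


# HYPOTHETICAL failure rates in failures per hour, for illustration only.
FAILURE_RATES = {
    "CE": 1 / 2000,
    "SE": 1 / 3000,
    "IOCE": 1 / 4000,
    "TCU": 1 / 10000,
    "SCU": 1 / 20000,
}
# subsystem -> (units installed, units required)
REQUIRED = {"CE": (3, 2), "SE": (6, 5), "IOCE": (3, 2), "TCU": (3, 2), "SCU": (3, 2)}


def reliability_9020d(t):
    """Eq. (4): product of the subsystem reliabilities at time t (hours)."""
    product = 1.0
    for name, (n, k) in REQUIRED.items():
        r = unit_reliability(FAILURE_RATES[name], t)
        product *= k_of_n_reliability(r, n, k)
    return product


def reliability_9020d_9020e(t):
    """Eq. (5): a 9020D and a 9020E of equal reliability, both required."""
    return reliability_9020d(t) ** 2


for hours in (0, 100, 500, 1000):
    print(hours, round(reliability_9020d(hours), 4), round(reliability_9020d_9020e(hours), 4))
```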

B.3 Reliability of the Rehost Configuration

The rehost system contains:

• two mainframes, where a mainframe is defined to include a CPU, an SE,
  and twelve channels;

• two TCU's;

• two SCU's.

The rehost system is operating successfully if all three of the following
conditions are satisfied:

• at least one mainframe is operating;

• at least one TCU is operating; and

• at least one SCU is operating.

Since a mainframe operates only if its CPU, its SE, and at least one
channel in each of the six pairs operates, the reliability of a single
mainframe is given by

    R_M = r_CPU r_SE R_CH.                                            (6)

Since there is only one CPU and one SE in the mainframe, these components
have simple exponential failure rates. The reliability for the six pairs of
channels is

    R_CH = [r_CH^2 + 2r_CH(1-r_CH)]^6.                                (7)

Therefore, eq. (6) now implies that the reliability for a single mainframe is

    R_M = r_CPU r_SE [r_CH^2 + 2r_CH(1-r_CH)]^6.                      (8)

For the TCU's and SCU's, one of two must function, so the reliabilities of
these subsystems are

    R_TCU = r_TCU^2 + 2r_TCU(1-r_TCU),                                (9)

    R_SCU = r_SCU^2 + 2r_SCU(1-r_SCU).                                (10)

The reliability of the complete rehost system is then

    R_rehost = [R_M^2 + 2R_M(1-R_M)] R_TCU R_SCU,                     (11)

where the complete expression is obtained by substituting from (8), (9), and
(10) into (11).

B.4 Approximating System MTBF

The reliability as a function of time is given for the 9020D/9020E
system by eq. (5) and for the rehost system by eq. (11). The method used to
extract from these functions the system MTBF's shown in Tables 2-5 and 2-6
is as follows. For a truly exponential reliability function, the MTBF is
the time at which reliability is equal to 0.37 (i.e., to 1/e). In the
present case, even though each unit has an exponential reliability function,
because of redundancy the system reliability function is not exponential.
Nevertheless, the time at which reliability equals 0.37 is used to
approximate the system MTBF. Therefore, the times shown in Tables 2-5 and
2-6 are only approximate MTBF's; strictly speaking, these are the times that
elapse between when the system begins running and when the reliability drops
to 0.37.
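The approximation described above amounts to solving R(t) = 1/e for t. The sketch below does this by bisection for a rehost-style reliability function. It is a minimal illustration under stated assumptions: the failure rates are HYPOTHETICAL placeholders rather than the report's Table 2-1 values, the function names are invented for the example, and bisection is simply one convenient root-finding method, not necessarily the one used in the report.

```python
import math

# HYPOTHETICAL failure rates in failures per hour, for illustration only.
RATES = {"CPU": 1 / 1500, "SE": 1 / 600, "CH": 1 / 5000, "TCU": 1 / 1000, "SCU": 1 / 7000}


def r(name, t):
    """Single-component reliability, r(t) = exp(-lambda * t)."""
    return math.exp(-RATES[name] * t)


def one_of_two(x):
    """Reliability of a redundant pair: x^2 + 2x(1 - x)."""
    return x * x + 2 * x * (1 - x)


def mainframe_reliability(t):
    """CPU and SE in series with six redundant channel pairs."""
    return r("CPU", t) * r("SE", t) * one_of_two(r("CH", t)) ** 6


def rehost_reliability(t):
    """At least one mainframe, one TCU, and one SCU still operating."""
    return (
        one_of_two(mainframe_reliability(t))
        * one_of_two(r("TCU", t))
        * one_of_two(r("SCU", t))
    )


def time_to_reliability(reliability, target=1 / math.e, upper=1e6, tol=1e-6):
    """Bisection for the time at which a decreasing reliability hits target."""
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if reliability(mid) > target:
            lo = mid  # still above target: the crossing is later
        else:
            hi = mid
    return (lo + hi) / 2


approx_mtbf = time_to_reliability(rehost_reliability)
print(f"reliability falls to 1/e after about {approx_mtbf:,.0f} hours")
```

For a single exponential unit the same routine returns the true MTBF, which is why the 1/e time is a natural stand-in when the system reliability function is not exponential.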


C. PARTIAL REPLACEMENT

The body of this report has assumed that if rehosting is adopted, then
the 9020's will be replaced at all twenty ARTCC's. However, since hardware
is the major cost (see Sec. 5.7), it has been suggested that the 9020's only
be replaced at those centers that face capacity problems; proponents of this
idea claim that this partial replacement would take care of the capacity
problems while minimizing the cost. The purpose of this appendix is to
point out the advantages and disadvantages of partial replacement.

There are three main disadvantages to partial replacement. First, with
two entirely different systems in the field, support would be greatly
complicated since training at the Academy, inventory management at the
Depot, and support at the Technical Center would need to be carried out for
the different systems. Second, since both the level of air traffic and the
lifetime of the rehost system are hard to predict, there is some doubt as to
exactly which centers will face capacity problems and which will, therefore,
require a new system. Third, if the view is taken that the rehost system
will evolve into a full replacement system (see Ch. 8), then money will
probably not be saved by partial replacement. The expenditure on new
hardware could not be eliminated; it could only be delayed.

The main advantage of partial replacement is the cost saving. An
additional advantage is that the number of transitions could be held down.
Some of the relevant costs will be estimated, but it should be stressed that
some costs cannot be quantified, e.g., costs due to the inconvenience or
confusion of having multiple systems. Therefore, the discussion below
should be thought of as a treatment of some of the costs and not as a
complete treatment. Six areas in which the cost of partial replacement
differs from that of total replacement will now be discussed. For
concreteness, assume that Amdahl 470/V7's are the mainframes that are
procured. Consider the cases where replacement is at 5 and 10 ARTCC's.

First, since new systems need not be purchased for the centers which
retain the 9020's, there is a saving in the hardware acquisition cost.
Since this cost is $5.4 million per center (see Table 5-9), the total saving
compared to total replacement would be $81.0 million if there is replacement
at 5 centers and $54 million if there is replacement at 10.

Second, an initial inventory of new spare parts need not be laid in for
the centers which retain the 9020's. Since the stock of spares is estimated
to be $1.3 million per center (see Subsec. 5.4.3), the saving relative to
total replacement is $19.5 million or $13.0 million if replacement were at 5
or 10 centers, respectively.

Third, no transition cost would be incurred at centers at which there is
no replacement. The per center saving is $1 million on remodeling cost,
$200,000 in extra personnel cost, and $370,000 for training cost (see Sec.
5.5). The saving in transition cost is then $1.6 million per center, for a
total of $24.0 million or $16 million if replacement is at 5 or 10 centers,
respectively.

Fourth, since the procurement would not take as long if there were
replacement at fewer centers, there will be a saving in the program
management and support cost. The procurement would be shortened by 18 or 12
months if there were replacement at 5 or 10 centers, respectively. Since
the program management is expected to cost $6.3 million per year during
deployment (see Table 5-19), this means the cost saving would be $9.5
million or $6.3 million, respectively, for the two cases.

Fifth, if there is no replacement at a center, then it does not reap the
annual saving in maintenance cost provided by the more reliable new system.
After the [illegible]-year shakedown period, the saving in maintenance
personnel cost is $412,292 per center (see Table 5-10). Therefore, the
annual cost penalty of not replacing is $6.2 million or $4.1 million,
depending on whether replacement is at 5 or 10 centers. With full
replacement, the annual parts saving is $1,013,000; $0.8 or $0.5 million of
this would not be saved if replacement were only at 5 or 10 centers. Thus,
the total annual cost penalty paid is $7.0 million if there is replacement
at 5 centers or $4.6 million if there is replacement at 10.

Sixth, extra personnel would be required at the Technical Center since
two systems would need support. Table C-1 shows the number of additional
people that, it is estimated, would be required at the Technical Center if
there were partial replacement. The total annual cost of these extra
personnel is rounded to $3.1 million. (This table assumes that the average
grade is GS-13, step 4, in AP; it is halfway between GS-13, step 4, and
GS-14, step 4, in AT and R&D; and it is GS-11, step 4, in ACT-700. The
salaries for GS-11, 13, and 14, step 4, are currently $24,736, $35,252, and
$41,657 respectively. To these salaries 10 percent has been added to cover
benefits and 5 percent to cover overtime.)

These figures are summarized in Table C-2. Compared to full
replacement, the saving in initial cost is $134.0 million if there is
replacement at 5 centers and $89.3 million if there is replacement at 10.
In other words, the initial cost is $107.0 million, $151.7 million, or
$241.0 million depending on whether replacement is at 5, 10, or 20 centers.
Compared to full replacement, the long-term increase in the annual cost is
$10.1 million if there is replacement at 5 centers and $7.7 million if there
is replacement at 10. In other words, with full replacement the saving in
annual cost is $9.3 million; with replacement at 10 centers the saving is
$1.6 million; with replacement at 5 centers the annual cost rises by $0.8
million.
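The one-time savings can be tallied with a short Python sketch. The per-center and per-year figures are the ones quoted in this appendix, expressed in thousands of dollars so that the arithmetic stays in integers; the names are hypothetical. The text rounds each line item to the nearest $0.1 million before summing, so its totals can differ from the unrounded totals here by up to $0.05 million.

```python
TOTAL_CENTERS = 20

# One-time costs in thousands of dollars, as quoted in this appendix.
HARDWARE_PER_CENTER = 5_400
SPARES_PER_CENTER = 1_300
TRANSITION_PER_CENTER = 1_600  # $1M + $200K + $370K, rounded as in the text
PROGRAM_MGMT_PER_YEAR = 6_300
MONTHS_SAVED = {5: 18, 10: 12}  # procurement shortened, keyed by centers replaced
FULL_REPLACEMENT_INITIAL_COST = 241_000


def one_time_saving(centers_replaced):
    """One-time saving (thousands of dollars) relative to replacing all 20."""
    retained = TOTAL_CENTERS - centers_replaced
    per_center = HARDWARE_PER_CENTER + SPARES_PER_CENTER + TRANSITION_PER_CENTER
    program_mgmt = PROGRAM_MGMT_PER_YEAR * MONTHS_SAVED[centers_replaced] // 12
    return retained * per_center + program_mgmt


for centers in (5, 10):
    saving = one_time_saving(centers)
    initial_cost = FULL_REPLACEMENT_INITIAL_COST - saving
    print(f"{centers:2d} centers: saving ${saving / 1000:.2f}M, "
          f"initial cost ${initial_cost / 1000:.2f}M")
```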
TABLE C-1: INCREASED ANNUAL PERSONNEL COST AT THE TECHNICAL CENTER IF THERE
           IS PARTIAL REPLACEMENT

                  Number of New
Organization      Personnel         Average Cost      Total Cost

AT                    30              $44,223         $1,326,690
AP                    20               40,540            810,800
R & D                 10               44,223            442,230
ACT-700               20               28,446            568,920

Total                                                 $3,148,640

TABLE C-2: AMOUNT SAVED IF THERE IS PARTIAL REPLACEMENT
           (millions)

                                  Number of Centers at which there is
                                  Replacement

One-Time Cost Saving                   5            10

Hardware acquisition                $ 81.0       $ 54.0
Initial parts inventory               19.5         13.0
Transition cost                       24.0         16.0
Program management                     9.5          6.3

Total                               $134.0       $ 89.3

Annual Cost

Maintenance cost                    ($ 7.0)      ($ 4.6)
Extra personnel at the FAATC        (  3.1)      (  3.1)

Total                               ($10.1)      ($ 7.7)

N.B. A figure in parentheses denotes a cost increase rather than a cost
saving.


D.  WX  SPECIAL  HABONABB  IS  NBSOBO  IN  THE  BBB08T  SYSTEM 


Tha  cahost  systaa  aa  dascrlbad  in  Sac.  1.3  containa  two  places  of 
special  hacdwara~tha  radar  input  line  atultiplaxoc  (RIN/LM)  and  tha  display 
buffer.  This  appendix  states  why  this  special  hardware  is  needed  and  what 
its  function  is. 

Any rehost computer system must be capable of interfacing to the radar
circuits and the display generators. The radar circuits provide raw data on
the location and identity of controlled aircraft. The display generators
maintain the geo-situation plot for the controller suites. In the current
9020 system, both of these interfaces are supported with special purpose
extensions of the basic system. That is, the radar input processing program
provides device support for the radar circuits and runs continually in an
IOCE after it is dispatched during startup/startover. All other devices are
supported in a traditional interrupt-driven manner. The display generators
are supported with special access to display buffers in the display channel
memory to ensure adequate response for the display refresh process. Each of
these interfaces represents special problems for a rehost system.

RIN/LM. If the radar circuits were directly connected to a rehost
computer system, then the mainframe would be required to provide support
equivalent to the current RIN support. The alternatives for this radar
input support are either to allow a channel program to run continually
(equivalent to the current approach) or to use the traditional
interrupt-driven support. Either of these alternatives would severely
compromise the performance of the mainframe due to the frequency and
response requirements for radar data processing.

An alternative for supporting radar data input is to provide a
pre-processor for each radar circuit that would perform the RIN function and
present valid and reformatted radar data to the mainframe in blocked records
so that one mainframe I/O operation would access several radar data values.
This approach would ensure adequate capacity to process the raw radar data
and avoid potential performance problems with the mainframe.

(It should be mentioned that Amdahl has studied the radar input problem
and has tentatively decided that the RIN program can run in the mainframe
without causing a performance problem. If this analysis proves correct,
then the RIN/LM can be omitted from the rehost system.)
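
The blocking idea behind the radar pre-processor can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the record fields, the validity test, the block size, and the names `RadarReport`, `validate`, and `blocked_records`. It shows only the general technique of validating raw values and grouping them so that the mainframe handles one record per block rather than one per value; it does not describe the actual RIN logic.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class RadarReport:
    """Hypothetical reformatted radar value handed to the mainframe."""
    circuit: int
    azimuth: int
    range_: int


def validate(raw: Tuple[int, ...]) -> Optional[RadarReport]:
    """Hypothetical validity check: reject malformed or out-of-range values."""
    if len(raw) != 3:
        return None
    circuit, azimuth, rng = raw
    if not (0 <= azimuth < 4096) or rng < 0:  # limits are assumptions
        return None
    return RadarReport(circuit, azimuth, rng)


def blocked_records(raw_stream: Iterable[Tuple[int, ...]],
                    block_size: int) -> Iterator[List[RadarReport]]:
    """Group valid reports into blocks; each block models one mainframe I/O."""
    block: List[RadarReport] = []
    for raw in raw_stream:
        report = validate(raw)
        if report is None:
            continue  # invalid data never reaches the mainframe
        block.append(report)
        if len(block) == block_size:
            yield block
            block = []
    if block:
        yield block  # flush the final partial block


# Ten valid values and one invalid value from a single circuit.
raw_values = [(1, i, 100 + i) for i in range(10)] + [(1, 9999, 5)]
blocks = list(blocked_records(raw_values, block_size=4))
print(len(blocks), [len(b) for b in blocks])
```

In this example the eleven raw values reach the mainframe as three blocked records instead of eleven separate transfers, which is the kind of reduction in I/O frequency the pre-processor is meant to provide.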

Display buffer. The display generators require periodic access to
display buffers which define the geo-situation plot so that the plan view
displays in the controller suites can be dynamically updated. In the
current 9020 system, these display buffers are provided as part of the
memory in the display channels, which allows them to be updated by the
Central Computer Complex (CCC) and to be accessed by the display
generators. In the baseline rehost configuration, one large memory would
serve both the CCC functions and the display functions. If the
display buffers were to be resident in the mainframe memory, the frequency
and response requirements for the display generator accesses to these
buffers would significantly degrade the overall memory performance for the
remainder of the system. An alternative approach for resolving this
potential mainframe problem is to provide a buffering capability between
the mainframe and the display generators. That is, the display buffers
would be maintained in special purpose memory units that could be loaded and
updated by the mainframe and accessed, as necessary, by the display
generators. Again, this approach would ensure adequate capacity to service
the display generators and avoid potential mainframe performance problems.